What is Watchman?

The Watchman project implements an HTTP server and Go library for downloading, parsing, and searching sanction lists. Watchman offers the following features:

  • Download OFAC, BIS Denied Persons List, Consolidated Screening List, and various other data sources on startup
  • Index data for searches
  • Libraries to download and parse the custom file formats of OFAC, US CSL, UK/EU CSL, and BIS DPL data

When searching across all sanction lists, Watchman uses the Jaro–Winkler algorithm to score how closely each search query matches a list entry. This mirrors what the US Treasury OFAC Search uses and what the academic literature recommends.
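To illustrate the kind of score involved, here is a minimal sketch of textbook Jaro–Winkler similarity in Go. It is not Watchman's implementation, which may differ in details such as per-word scoring or weighting. The function names `jaro` and `jaroWinkler`, the 0.1 prefix scale, and the 4-character prefix cap are the conventional textbook choices and are assumptions here.

```go
package main

import "fmt"

// jaro returns the textbook Jaro similarity of two strings, in [0, 1].
// Comparison is case-sensitive and works on runes, not bytes.
func jaro(s1, s2 string) float64 {
	a, b := []rune(s1), []rune(s2)
	if len(a) == 0 && len(b) == 0 {
		return 1
	}
	if len(a) == 0 || len(b) == 0 {
		return 0
	}
	// Two equal characters count as a match only within this distance.
	longest := len(a)
	if len(b) > longest {
		longest = len(b)
	}
	window := longest/2 - 1
	if window < 0 {
		window = 0
	}
	aMatched := make([]bool, len(a))
	bMatched := make([]bool, len(b))
	matches := 0
	for i := range a {
		lo := i - window
		if lo < 0 {
			lo = 0
		}
		hi := i + window + 1
		if hi > len(b) {
			hi = len(b)
		}
		for j := lo; j < hi; j++ {
			if !bMatched[j] && a[i] == b[j] {
				aMatched[i], bMatched[j] = true, true
				matches++
				break
			}
		}
	}
	if matches == 0 {
		return 0
	}
	// Count matched characters that line up in a different order.
	transpositions, j := 0, 0
	for i := range a {
		if !aMatched[i] {
			continue
		}
		for !bMatched[j] {
			j++
		}
		if a[i] != b[j] {
			transpositions++
		}
		j++
	}
	m := float64(matches)
	t := float64(transpositions) / 2
	return (m/float64(len(a)) + m/float64(len(b)) + (m-t)/m) / 3
}

// jaroWinkler boosts the Jaro score for strings sharing a common prefix
// (up to 4 characters, scale 0.1: the conventional values, assumed here).
func jaroWinkler(s1, s2 string) float64 {
	score := jaro(s1, s2)
	a, b := []rune(s1), []rune(s2)
	prefix := 0
	for prefix < len(a) && prefix < len(b) && prefix < 4 && a[prefix] == b[prefix] {
		prefix++
	}
	return score + float64(prefix)*0.1*(1-score)
}

func main() {
	fmt.Printf("%.4f\n", jaroWinkler("MARTHA", "MARHTA"))
	fmt.Printf("%.4f\n", jaroWinkler("Nicolas MADURO", "Nicolas MADURO MOROS"))
}
```

A score of 1 means the strings are identical and 0 means they share no matching characters; a caller would typically keep only entries above some threshold.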

FAQ

Entities from sanction lists and other data files pass through several pre-computation steps before they are added to the search index. The steps run in the following order:

  • SDN Reordering
    Each individual's SDN name is re-ordered (Example: from "MADURO MOROS, Nicolas" to "Nicolas MADURO MOROS").
  • Company Name Cleanup
    Suffixes such as "CO.", "INC.", and "L.L.C." are removed from company names.
  • Stopword Removal
    Stopwords are removed. See bbalet/stopwords for a full list of supported languages and words subject to removal.
  • UTF-8 Normalization
    Punctuation is removed, along with extra spaces at both ends of the entity name. Go's golang.org/x/text normalization methods are used to consolidate entity names and search queries, which improves searching across multiple languages.
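The steps above can be sketched as a small pipeline. This is a simplified illustration using only the Go standard library, not Watchman's code: the function names, the short suffix list, and the three-word stopword list are all hypothetical, and the Unicode normalization that Watchman performs with golang.org/x/text is left out.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Hypothetical, partial lists for illustration only. Watchman's real
// suffix list is longer and its stopwords come from bbalet/stopwords.
var (
	companySuffixes = []string{"CO.", "INC.", "L.L.C."}
	stopwords       = []string{"the", "of", "and"}
)

// reorderSDNName turns "LAST, First" into "First LAST". Names that do
// not contain exactly one comma are returned unchanged.
func reorderSDNName(name string) string {
	parts := strings.Split(name, ",")
	if len(parts) != 2 {
		return name
	}
	return strings.TrimSpace(parts[1]) + " " + strings.TrimSpace(parts[0])
}

// dropWords removes every word that case-insensitively equals an entry
// in drop, ignoring a trailing comma on the word.
func dropWords(name string, drop []string) string {
	var kept []string
	for _, w := range strings.Fields(name) {
		key := strings.ToUpper(strings.TrimRight(w, ","))
		remove := false
		for _, d := range drop {
			if key == strings.ToUpper(d) {
				remove = true
				break
			}
		}
		if !remove {
			kept = append(kept, w)
		}
	}
	return strings.Join(kept, " ")
}

// stripPunctuation deletes punctuation, trims both ends, and (as a
// side effect of strings.Fields) collapses runs of inner whitespace.
func stripPunctuation(name string) string {
	cleaned := strings.Map(func(r rune) rune {
		if unicode.IsPunct(r) {
			return -1 // a negative rune drops the character
		}
		return r
	}, name)
	return strings.Join(strings.Fields(cleaned), " ")
}

// precompute applies the four steps in the documented order.
func precompute(name string, isIndividual bool) string {
	if isIndividual {
		name = reorderSDNName(name)
	}
	name = dropWords(name, companySuffixes)
	name = dropWords(name, stopwords)
	return stripPunctuation(name)
}

func main() {
	fmt.Println(precompute("MADURO MOROS, Nicolas", true))
	fmt.Println(precompute("The Bank of ACME CO., INC.", false))
}
```

Applying the same function to both list entries and incoming search queries keeps the two sides comparable when they are scored.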

Why are exact matches of words not ranked higher?

Watchman offers an environment variable called EXACT_MATCH_FAVORITISM that adjusts the weight of exact matches within a query. Its value is a percentage (float64) added to exact matches before the final match percentage is computed. Try 0.1, 0.25, or 0.5 in your testing.
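As a rough sketch of how such a bonus could work, the Go example below adds the favoritism value to each exactly matching word before averaging. How Watchman actually combines per-word scores is not described here, so the averaging, the cap at 1.0, and the names `readFavoritism` and `finalScore` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// readFavoritism parses EXACT_MATCH_FAVORITISM as a float64. It falls
// back to 0 (no bonus) when the variable is unset or not a number.
func readFavoritism() float64 {
	v, err := strconv.ParseFloat(os.Getenv("EXACT_MATCH_FAVORITISM"), 64)
	if err != nil {
		return 0
	}
	return v
}

// finalScore averages per-word similarity scores after adding the
// favoritism bonus to each exact (1.0) word match. The result is
// capped at 1.0. Both the averaging and the cap are assumptions.
func finalScore(wordScores []float64, favoritism float64) float64 {
	if len(wordScores) == 0 {
		return 0
	}
	total := 0.0
	for _, s := range wordScores {
		if s >= 1.0 {
			s += favoritism
		}
		total += s
	}
	avg := total / float64(len(wordScores))
	if avg > 1.0 {
		return 1.0
	}
	return avg
}

func main() {
	// One exact word match and one partial match.
	scores := []float64{1.0, 0.5}
	fmt.Println(finalScore(scores, 0))
	fmt.Println(finalScore(scores, readFavoritism()))
}
```

Under these assumptions, a query with one exact word and one half-matching word scores 0.75 with no bonus and 0.875 with a favoritism of 0.25, so entries containing an exact word rank higher.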