Lance Snider

Case Study

Zyft: Snapshot Testing for XPath & Metadata Logic that Needs to Work across 40k+ Domains

The Zyft Chrome and Safari extension needs to reliably determine whether web pages from over 40,000 retailers contain a valid product. It does this by examining the page's metadata. For domains that don't contain useful metadata, Zyft falls back to pre-defined XPaths.
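To make the "metadata first, XPath as fallback" idea concrete, here is a minimal sketch in Python. It is not Zyft's actual code. It assumes, for illustration only, that "useful metadata" means an Open Graph `og:type` of `product` or a JSON-LD block whose `@type` is `Product`. The names `MetadataCollector`, `has_product`, and `xpath_fallback` are hypothetical.

```python
import json
from html.parser import HTMLParser


class MetadataCollector(HTMLParser):
    """Collects <meta> values and JSON-LD blocks from an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta = {}       # e.g. {"og:type": "product"}
        self.json_ld = []    # parsed JSON-LD payloads
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content") is not None:
                self.meta[key.lower()] = attrs["content"]
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed JSON-LD rather than fail the whole page


def has_product(html, xpath_fallback=None):
    """Return True if the page's metadata (or the fallback) indicates a product.

    Assumption: only og:type and a top-level JSON-LD "@type" are checked.
    `xpath_fallback` is a hypothetical callable standing in for per-domain
    XPath rules; it receives the raw HTML and returns a truthy value on a match.
    """
    collector = MetadataCollector()
    collector.feed(html)
    collector.close()

    if collector.meta.get("og:type", "").lower() == "product":
        return True

    for block in collector.json_ld:
        items = block if isinstance(block, list) else [block]
        if any(isinstance(i, dict) and i.get("@type") == "Product" for i in items):
            return True

    if xpath_fallback is not None:
        return bool(xpath_fallback(html))
    return False
```

Real pages are messier than this (nested `@graph` arrays, `@type` given as a list, microdata instead of JSON-LD), which is exactly the kind of variation that makes one-off fixes risky.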

Challenges:

  • It's wild how differently every web page's metadata is named and structured.
  • A small change to improve one domain may negatively affect thousands of others.
  • XPaths are brittle and can break when the page's HTML changes.
  • You could never write enough unit tests to confidently make changes to the logic.

Solution:

I wrote a script that takes a list of domains, saves each page's HTML, runs the existing metadata and XPath logic against it, and saves the results. After changing the logic, I rerun the tests and check whether the new results match the saved ones, which lets me make changes safely.
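The record-then-compare loop can be sketched roughly as follows. This is an illustration, not the actual Zyft script: the file layout (`page.html` and `expected.json` per domain), the function names, and the `extract` callable standing in for the metadata and XPath logic are all assumptions, and fetching the HTML over the network is left out.

```python
import json
from pathlib import Path


def record_snapshot(domain, html, extract, snapshot_dir):
    """Save a domain's HTML plus the result the current logic produces for it."""
    folder = Path(snapshot_dir) / domain
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "page.html").write_text(html, encoding="utf-8")
    expected = json.dumps(extract(html), indent=2, sort_keys=True)
    (folder / "expected.json").write_text(expected, encoding="utf-8")


def check_snapshots(extract, snapshot_dir, only=None):
    """Re-run the logic on the saved HTML and report every mismatch.

    Returns {domain: (expected, actual)} for domains whose result changed.
    `only` optionally restricts the run to a set of domain names.
    """
    failures = {}
    for folder in sorted(Path(snapshot_dir).iterdir()):
        if not folder.is_dir() or (only and folder.name not in only):
            continue
        html = (folder / "page.html").read_text(encoding="utf-8")
        expected = json.loads((folder / "expected.json").read_text(encoding="utf-8"))
        # Round-trip through JSON so the comparison uses the same types as the file.
        actual = json.loads(json.dumps(extract(html)))
        if actual != expected:
            failures[folder.name] = (expected, actual)
    return failures
```

Because the HTML is saved alongside the expected result, the check runs offline and is unaffected by retailers changing their live pages. Passing `only={"shop.example"}` (a placeholder domain) would limit the run to a single domain.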

Every time we stumble across a problematic new domain, I can run a script to add it, then test that domain on its own or run all domains at once.

Result:

Since implementing snapshot testing, we've been able to improve the metadata to the point where we rarely need to rely on XPaths.