The United States Patent & Trademark Office offers free access to a number of intellectual property data sources. Among these is the Patent Examination Data System, or PEDS, which provides data about and accumulated during the examination of patent applications. A friend recently asked whether I could make some software to convert data downloaded from PEDS into a readable form. Knowing nothing about PEDS, I followed the link he provided and took a look.
PEDS is searchable through a web interface at https://ped.uspto.gov/peds and provides all of the aforementioned data for each search result. Unfortunately, only a preview of the resulting data, covering the first twenty results, is viewable through the site. To access the remaining results, it’s necessary to download the data in JSON or XML format.
No tools are provided for handling the downloaded data, so it’s left to the user to obtain or devise software to parse and format the data into a readable form. I was unable to find any free or inexpensive tools for the purpose, so anyone on a budget must otherwise resort to reading the unparsed data in a browser or editor.
After some consideration and random poking around, I discovered Extensible Stylesheet Language Transformations, or XSLT, and realized that an XSLT stylesheet might do the job. For those not familiar with XSLT, a stylesheet describes how to transform an XML document into a different form, such as a neatly formatted HTML or XHTML document that can be viewed in an ordinary web browser.
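To give a feel for what such a stylesheet looks like, here is a minimal sketch. It is not the XSLT_PEDS stylesheet, and the element names it matches (applications, application, number, title, filingDate) are hypothetical placeholders chosen for illustration. They are not the element names PEDS actually uses.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input assumed by this sketch (NOT the real PEDS schema):
     <applications>
       <application>
         <number>12345678</number>
         <title>Example title</title>
         <filingDate>2015-03-02</filingDate>
       </application>
     </applications>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the processor to serialize the result as HTML. -->
  <xsl:output method="html" encoding="UTF-8" indent="yes"/>

  <!-- Match the document root and emit the page skeleton. -->
  <xsl:template match="/">
    <html>
      <head><title>Applications</title></head>
      <body>
        <!-- Process every application element in document order. -->
        <xsl:apply-templates select="applications/application"/>
      </body>
    </html>
  </xsl:template>

  <!-- Emit one block per application, followed by a horizontal rule. -->
  <xsl:template match="application">
    <h2><xsl:value-of select="title"/></h2>
    <p>Application number: <xsl:value-of select="number"/></p>
    <p>Filing date: <xsl:value-of select="filingDate"/></p>
    <hr/>
  </xsl:template>

</xsl:stylesheet>
```

Any XSLT 1.0 processor should be able to apply a stylesheet like this. With xsltproc, for example, and assuming the files are named sketch.xsl and data.xml, the command would be: xsltproc sketch.xsl data.xml > out.html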
A good amount of nose-to-the-grindstoning culminated in a workable XSLT stylesheet, along with some scripts to make for easy generation of HTML files from XML data downloaded from PEDS. In contrast to the PEDS site preview, the resulting HTML files include the data for all of the search results. The data may be provided in a single file or by year in a plurality of files, depending on the user’s choice, and each file can be scrolled through to view the data without need to click drop-down links or tabs as required on the PEDS site.
The combination of stylesheet and scripts is called XSLT_PEDS. The latest version is available for download here. Requirements, setup, and usage information may be read online at https://github.com/dfyockey/XSLT_PEDS.
Below is an example comparing a portion of data as presented on the PEDS site with the corresponding HTML page generated using XSLT_PEDS. The PEDS site view on the left was pieced together from a plurality of screenshots, as it’s not possible to view data on the site in a scrollable or printable format. The XSLT_PEDS result view is a screenshot from a single scrollable and printable webpage. If one downloads data for 100 results and chooses to translate it into a single file, the data for each result is provided in the format shown, with results separated by dashed lines, and one can easily scroll through all 100 to view the data.
|PEDS Data from USPTO’s site||PEDS Data generated by XSLT_PEDS|
Hopefully XSLT_PEDS will be of use to those wishing to more readily access the free data provided by the USPTO through the PEDS site. If you’re using it and happen to notice anything erroneous or missing between the data in the generated HTML pages and the data on the PEDS site, please let me know in a comment here or by filing an issue at https://github.com/dfyockey/XSLT_PEDS/issues.