Summary
The University of California, Berkeley and San Francisco combined are one of the largest buyers in the Bay Area. Historically, it has been a time-consuming process to analyze suppliers' proposed price files and ensure the University is not paying more than contracted. Through the use of Pandas and Python, this once tedious and manual process can routinely be done in a matter of a few seconds.
Description
For the University of California, Berkeley and San Francisco, a routine management process of supplier price files used to be a time-consuming process. It is essential to analyze the sometimes tens of thousands of items a supplier offers to make sure the University doesn't accept larger price increases than is in compliance with a contract. A historical figure of past purchases is matched against the current and proposed catalogs and then analyzed to ultimately find out the percentage increase and number of products removed. Each Universities' motivation is to not accept a file that has larger price increases than contracted nor a file with several previously purchased products removed.
To combat the tedious and time-consuming process of manually analyzing the previous spend with the current and proposed files, a Python script was written. This heavily uses Pandas as well as Numpy for computations. The code uploads all three files as a dataframes and creates a common variable to compare similar products. It matches what was previously purchased to the identical products in the current and proposed catalogs. After filtering any 'bad' input that would skew the results, several values are computed and the code outputs the necessary figures to determine if a supplier's price file is acceptable. The code even documents each catalog result automatically so the historical changes are organized and noted in a csv.
This code is an exponential improvement to the manual process that was historically done. The end numbers are known in a matter of seconds as opposed to hours of Excel or Access analysis. For some suppliers, Excel is even incapable of uploading the entire catalog, thus making any analysis nearly impossible. Python and Pandas have not only made the analysts time more efficient but have opened the door for several possibilities.
Although this code has greatly improved this continuous analysis, more advanced techniques could potentially improve the process. The department soon hopes to use a forecasted spend figure rather than a historical snapshot to project spend against the proposed catalog. Moving forward, having the analysts armed with Python knowledge, Strategic Sourcing hopes to yield more meaning through the daily flow of spend data through machine learning techniques.