Introducing Web Scraping in Price Data Collection
The International Comparison Program (ICP) team at ESCWA has been exploring new techniques to enhance price statistics in the region, using technology and Big Data tools as non-traditional data collection methods that complement traditional data collection. This initiative aims to improve data quality, raise the frequency of data collection, reduce field visits and increase work efficiency by ensuring regular automated extraction of price data from online outlets for both Consumer Price Index (CPI) and ICP purposes.
Under this framework, a training workshop on web scraping price data from major online outlets was conducted at the Information and eGovernment Authority (iGA) in Bahrain to build the national team’s capacity in web scraping as a complement to the ongoing traditional data collection process.
The training was attended by six participants from iGA and focused on:
- Introducing Big Data and non-traditional data collection methods like scanner data and web scraping, while highlighting the benefits, challenges, and potential uses of each method in price statistics.
- Demonstrating the web scraping method for extracting data from major online outlets in Bahrain.
- Training the participants to write web scraping algorithms in Python from scratch, using different structures and scripts according to the nature of each online outlet.
- Providing hands-on practice in web scraping, where each participant had the chance to apply the knowledge gained and customize the web scrapers to automate data extraction for online outlets in Bahrain.
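As a rough illustration of the kind of script such training covers, the following is a minimal sketch in Python that extracts product names and prices from a page. It is not the code used in the workshop: the HTML sample, the `name` and `price` class names, and the `PriceParser` and `extract_prices` helpers are all hypothetical, and a real outlet would need a parser adapted to its own page structure.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a downloaded product page.
# Real outlets each use their own structure and class names.
SAMPLE_HTML = """
<div class="product"><span class="name">Rice 5 kg</span><span class="price">BHD 2.450</span></div>
<div class="product"><span class="name">Milk 1 L</span><span class="price">BHD 0.600</span></div>
"""


class PriceParser(HTMLParser):
    """Collects (name, currency, amount) tuples from the assumed markup."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None   # which field the next text node belongs to
        self._current = {}   # fields gathered for the product being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field is None:
            return  # text outside the spans of interest
        self._current[self._field] = data.strip()
        self._field = None
        if "name" in self._current and "price" in self._current:
            # Assumes prices are written as "<currency> <amount>".
            currency, amount = self._current["price"].split()
            self.rows.append((self._current["name"], currency, float(amount)))
            self._current = {}


def extract_prices(html):
    parser = PriceParser()
    parser.feed(html)
    return parser.rows


rows = extract_prices(SAMPLE_HTML)
for name, currency, amount in rows:
    print(name, currency, amount)
```

In practice the page would be downloaded on a regular schedule rather than hard-coded, and the extracted rows would be stored with a collection date so that price series can be built over time.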
In addition, a session on National Accounts was conducted to improve the participants’ knowledge of its framework, classification, data compilation and validation.