How to open the 42matters sample file dumps with Excel

42matters provides app information and insights via an up-to-date, complete and consistent file feed, optimized for large-data ingestion. In this tutorial, we show you how to get started with the sample Android app metadata enterprise dumps. The structure and format are the same as in the official file dumps, so the steps shown here can easily be reused with them.

The only tool we need is jq, a command-line JSON processor that can slice, filter and convert your data. jq can be installed on most platforms; since the instructions vary, we leave installation to the reader.
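To give a feel for what slicing, filtering and converting look like in practice, here is a minimal sketch. It assumes jq is already on your PATH. The record is made up for illustration and is not real 42matters data; only its field names follow the ones used later in this tutorial.

```shell
# A made-up record used only for illustration (not real 42matters data).
record='{"package_name":"com.example.app","title":"Example App","rating":4.5,"number_ratings":120}'

# Slice: pull out a single field as raw text.
printf '%s\n' "$record" | jq -r '.title'

# Filter: keep the record only if its rating is at least 4,
# and reduce it to two fields.
printf '%s\n' "$record" | jq -c 'select(.rating >= 4) | {package_name, rating}'

# Convert: turn selected fields into one CSV row.
printf '%s\n' "$record" | jq -r '[.package_name, .rating] | @csv'
```

The `-r` option prints raw strings instead of JSON-quoted ones, which is what you want whenever the output is meant for another tool rather than for a JSON parser.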

Download the sample standard file dump locally. Sample files are updated daily so you will always be working with fresh data. If you prefer using the command line, you can execute the following command:

curl https://s3.amazonaws.com/external.42matters.com/1/42apps/v0.1/sample/playstore/lookup/latest/playstore-00.tar.gz -o playstore-00.tar.gz

Unpack the file. Usually this can be done by double-clicking the file, but if you prefer the command line, you can run the following:

tar xvfz playstore-00.tar.gz

At this point you should have a file called playstore-00. It contains Android app metadata in line-delimited JSON format, with one app object per line. This is great for machine processing but less readable for us humans. To get a glimpse of the data inside, we will transform it into a CSV file that can be opened with, for example, Excel or Google Sheets.
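Before converting anything, it can help to look at a single record in readable form. The following is a minimal sketch, assuming jq is on your PATH. The file `sample.ndjson` and its two records are made up for illustration and merely stand in for playstore-00; real records contain many more fields.

```shell
# Create a tiny stand-in file with two made-up records (not real 42matters data).
cat > sample.ndjson <<'EOF'
{"package_name":"com.example.one","title":"Example One","rating":4.5,"number_ratings":120}
{"package_name":"com.example.two","title":"Example, Two","rating":3.75,"number_ratings":45}
EOF

# Pretty-print only the first record.
head -n 1 sample.ndjson | jq .

# List the field names of the first record, one per line.
head -n 1 sample.ndjson | jq -r 'keys[]'
```

To try the same on the real data, replace `sample.ndjson` with `playstore-00`. Using `head -n 1` keeps the output short no matter how large the dump is.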

Refer to the Android App Object to get an idea of which fields are available. We are interested in each app's package_name, title, rating and number_ratings (the number of ratings). The following command extracts those attributes from the line-delimited JSON file, line by line, and prints them in CSV format.

jq -r "[.package_name, .title, .rating, .number_ratings] | @csv" playstore-00 > playstore.csv
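The command above writes data rows only, so the spreadsheet will have no column names. One possible way to add a header row is sketched below, assuming jq is on your PATH. The file `sample.ndjson` and its records are made up for illustration and stand in for playstore-00, and the output name `sample.csv` is likewise arbitrary.

```shell
# Create a tiny stand-in file with two made-up records (not real 42matters data).
cat > sample.ndjson <<'EOF'
{"package_name":"com.example.one","title":"Example One","rating":4.5,"number_ratings":120}
{"package_name":"com.example.two","title":"Example, Two","rating":3.75,"number_ratings":45}
EOF

# Write a header line first, then one CSV row per input record.
{
  echo '"package_name","title","rating","number_ratings"'
  jq -r '[.package_name, .title, .rating, .number_ratings] | @csv' sample.ndjson
} > sample.csv

cat sample.csv
```

Note that `@csv` wraps string values in double quotes, so a title containing a comma (like the second made-up record) still ends up in a single column when the file is opened in a spreadsheet.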

After the command completes successfully, you will have a well-formatted CSV file that can be opened with any tool that supports this format, such as Excel.

With this, we conclude today's tutorial. Feel free to check out all of 42matters' file dump and API offerings to see how we can make your business better.