Getting Started with File Dumps


42matters generates file dumps of the metadata of mobile apps and charts available on Google Play and iTunes and makes them available to its customers for download.

File Dumps Available

Data Format

Data is stored in a single gzipped file of line-delimited JSON with the following characteristics:

  • Each line is a valid JSON object
  • UTF-8 encoding
  • Line separator is '\n'
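The characteristics above map directly onto a small reader. The following is a minimal sketch, assuming a plain gzip file at a path of your choosing; the function name iter_records is hypothetical and not part of any 42matters tooling. Dumps delivered as .tar.gz archives need to be unpacked first.

```python
import gzip
import json
from typing import Any, Dict, Iterator


def iter_records(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one dict per line of a gzipped, line-delimited JSON file."""
    # newline="\n" keeps '\n' as the only line separator, as the format specifies.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="\n") as lines:
        for line in lines:
            if line.strip():  # tolerate a trailing blank line
                yield json.loads(line)
```

Because the function is a generator, records are decoded one at a time and the whole dump never has to fit in memory.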

Access Credentials

In order to obtain your Access Credentials, please read the File Dumps page or contact us.

Note: Do not share your access credentials publicly, for example in emails, source control, or chats. Doing so will lead to your credentials being disabled.

Authentication

Clients require AWS S3 credentials in order to obtain the data files. Your credentials are available for you in your 42matters account under Launchpad.

To verify that your credentials work, try them out with a tool such as Cyberduck. Recent Cyberduck releases have removed the option to enter the path (bucket), which is why we suggest an older version, e.g. Cyberduck v4.8.1. Enter external.42matters.com as the path (bucket), together with the access credentials from your account. You will then be able to navigate to a target location such as /1/42apps/v0.1/production/itunes/ or /1/42apps/v0.1/production/playstore.


To download the feed files programmatically, use an AWS S3 client library in the language of your choice. For Python, for example, we recommend Boto3, and more specifically its S3Transfer tool for bulk downloads.

Automation with AWS CLI

Here we show an example of how to use awscli to list a bucket's contents and download a standard playstore dump for a particular date.

# 1) Install awscli - https://aws.amazon.com/cli/
pip install awscli --upgrade --user

# 2) Configure awscli with the credentials we've provided in your account.
aws configure --profile YOUR_COMPANY

# 3) list the contents of a folder
aws s3 ls s3://external.42matters.com/1/42apps/v0.1/production/playstore/lookup/ --profile YOUR_COMPANY

# 4) list the contents of the timestamped folder
aws s3 ls s3://external.42matters.com/1/42apps/v0.1/production/playstore/lookup/2018-08-16/ --profile YOUR_COMPANY

# 5) download the file locally
aws s3 cp s3://external.42matters.com/1/42apps/v0.1/production/playstore/lookup/2018-08-16/playstore-00.tar.gz playstore.tar.gz --profile YOUR_COMPANY

# 6) unpack the file
tar xvfz playstore.tar.gz
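
As an alternative to step 6, the downloaded archive can be read from Python without unpacking it to disk. This is a minimal sketch, assuming the archive holds one or more regular files of line-delimited JSON; the function name iter_tar_records is hypothetical.

```python
import io
import json
import tarfile
from typing import Any, Dict, Iterator


def iter_tar_records(archive_path: str) -> Iterator[Dict[str, Any]]:
    """Yield JSON objects from every regular file inside a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            raw = archive.extractfile(member)
            # Decode as UTF-8 and split on '\n' only, matching the data format.
            with io.TextIOWrapper(raw, encoding="utf-8", newline="\n") as lines:
                for line in lines:
                    if line.strip():
                        yield json.loads(line)
```

Streaming this way avoids keeping both the archive and its unpacked contents on disk, at the cost of decompressing again on every pass.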
            

Automation with Python Boto3

Here we show an example of how to use boto3 with S3Transfer to download a standard playstore dump for a particular date. This code snippet is written in Python.

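import boto3
from boto3.s3.transfer import S3Transfer

# Replace the placeholders with the access credentials from your 42matters account.
aws_access_key_id = 'YOUR_ACCESS_KEY_ID'
aws_secret_access_key = 'YOUR_SECRET_ACCESS_KEY'

s3_client = boto3.client('s3', aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key, use_ssl=True)
s3_transfer = S3Transfer(s3_client)
# Download the dump for 2018-08-16 into the current directory.
s3_transfer.download_file(bucket='external.42matters.com', key='1/42apps/v0.1/production/playstore/lookup/2018-08-16/playstore-00.tar.gz', filename='playstore.tar.gz')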
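
Before downloading, it can help to list which files exist for a given date, much like the aws s3 ls steps in the AWS CLI example. The sketch below assumes an S3 client created with boto3.client('s3', ...) and uses the list_objects_v2 paginator; the helper name iter_keys is hypothetical.

```python
from typing import Any, Iterator


def iter_keys(s3_client: Any, bucket: str, prefix: str) -> Iterator[str]:
    """Yield every object key under `prefix`, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page holds no objects.
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Usage (needs boto3 and valid credentials):
#   prefix = "1/42apps/v0.1/production/playstore/lookup/2018-08-16/"
#   for key in iter_keys(s3_client, "external.42matters.com", prefix):
#       print(key)
```

Each yielded key can be passed as the key argument of download_file.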

Examine file dump data

A good command-line tool for working with line-delimited JSON files is jq. Here are a couple of terminal commands to get you started after you unpack the files:

Print the titles of the first 100 Google Play apps

head -n 100 playstore-00 | jq -c '.title' -r

Print the titles of the first 100 iTunes apps

head -n 100 itunes-00 | jq -c '.trackCensoredName' -r
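
The same inspection can be done in Python when jq is not at hand. This is a minimal sketch that assumes an unpacked dump file such as playstore-00 in the current directory; the function name first_values is hypothetical.

```python
import json
from itertools import islice
from typing import Any, List


def first_values(path: str, field: str, limit: int = 100) -> List[Any]:
    """Return `field` from the first `limit` lines of an unpacked dump file."""
    with open(path, encoding="utf-8", newline="\n") as lines:
        # islice stops after `limit` lines, so the rest of the file is never read.
        return [json.loads(line).get(field) for line in islice(lines, limit)]


# Usage:
#   for title in first_values("playstore-00", "title"):
#       print(title)
```

Records that lack the field yield None here, where jq would print null.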

Last Modified: 2018-11-11

