Ashby to S3 Data Share
The AWS S3 Data Share provides a simple and flexible way for enterprise customers to access their Ashby data directly in Amazon S3. This option is designed for teams who prefer to work with data in their own AWS environment, with the flexibility to consume and query that data from downstream systems.

Access

Is this available on my plan?

The AWS S3 Data Share is only available to customers on the Enterprise plan.

| Foundations | Legacy Plus | Plus | Enterprise |
| --- | --- | --- | --- |
| ❌ | ❌ | ❌ | ✅ |

Overview

What you can expect

Once set up, Ashby will deliver a complete, structured export of your data into a dedicated S3 bucket in your AWS account. The data is organized in a consistent format so your team can query it using any engine or platform of your choice (e.g. AWS Athena, Redshift or Google BigQuery).

The data is refreshed daily, and each cycle delivers a complete copy of your dataset to your S3 bucket. Because every delivery is a full copy, you do not need to handle deduplication, merging or change tracking.

Setup

What we’ll need from you

If you are interested in the S3 Data Share, please reach out to your Customer Success Manager to start the setup process. To activate the S3 Data Share, you’ll need to provision and provide the following AWS-side items:

- The name of an S3 bucket in your AWS account where Ashby can deliver the data.
- The ARN of an IAM role that Ashby can assume for write-only access.
- The primary contact(s) and technical owners to be included in future S3 Data Share update notifications.

We’ll share templates and examples to make this setup straightforward.

What you’ll see in S3

On each sync cycle, your bucket will receive a set of Parquet files representing the entirety of your Ashby data, structured in the following format:

s3://<bucket name>/ashby shares/<table name>/sync dtm=<yyyymmddhhmmss>/ parquet

These exports are organized to make downstream use simple, whether you’re loading them into a warehouse or querying them directly with an engine like Athena.

With each successful sync, we will also update a manifest file in your bucket, which includes the following:

- The sync dtm for the latest successful sync (i.e. the most up-to-date folder for each table in the structure above).
- The list of tables and the row count for each synced table, with their column names and data types listed.

The list of included tables and their schema is also available at https://docs.google.com/spreadsheets/d/1jtezzsosdjag9rvxdvvumlriusjeollo4bwwe3 ffhi/edit?gid=1316200980#gid=1316200980

Related Data Sharing Options

If your organization uses Snowflake, you can also choose our Enterprise Snowflake Data Share feature, which provides direct access to your Ashby data from within your own Snowflake instance. See the documentation for that feature for more details.

FAQ

How often does the S3 data refresh?

The S3 Data Share refreshes data once per day.

Is the data provided as a delta sync or snapshot?

A full snapshot of your Ashby data is provided each day, with the date of the snapshot recorded in the folder structure.

What data is provided in the data share?

The list of included tables and their schema is available at https://docs.google.com/spreadsheets/d/1jtezzsosdjag9rvxdvvumlriusjeollo4bwwe3 ffhi/edit?gid=1316200980#gid=1316200980
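
Because every sync writes a full copy of each table into its own timestamped folder, a downstream job usually only needs the newest folder per table. The following is a minimal sketch of that selection, not an Ashby-provided tool. It works on a plain list of object keys (for example, from an S3 listing) and relies on the 14-digit timestamp sorting chronologically when compared as text. The function name `latest_snapshot_prefixes` is hypothetical, and the partition key spelling `sync_dtm` and the `.parquet` file suffix are assumptions, since the path shown above appears to have lost some punctuation.

```python
import re
from typing import Dict, Iterable, Tuple


def latest_snapshot_prefixes(
    keys: Iterable[str],
    partition_key: str = "sync_dtm",  # assumed spelling; adjust to match your bucket
) -> Dict[str, str]:
    """Map each table name to the key prefix of its newest snapshot folder."""
    # Expected key shape (an assumption based on the layout described above):
    #   <anything>/<table name>/<partition_key>=<14 digits>/<file>.parquet
    pattern = re.compile(
        r"^(?P<prefix>(?:.*/)?(?P<table>[^/]+)/"
        + re.escape(partition_key)
        + r"=(?P<dtm>\d{14})/)[^/]+\.parquet$"
    )

    latest: Dict[str, Tuple[str, str]] = {}
    for key in keys:
        match = pattern.match(key)
        if match is None:
            # Skip anything that is not a Parquet file in a snapshot folder,
            # such as the manifest file.
            continue
        table = match.group("table")
        dtm = match.group("dtm")
        # Fixed-width yyyymmddhhmmss strings compare chronologically as text.
        if table not in latest or dtm > latest[table][0]:
            latest[table] = (dtm, match.group("prefix"))

    return {table: prefix for table, (_, prefix) in latest.items()}
```

Since the manifest records the latest successful sync, reading it is likely the safer way to pick a folder, because a listing alone cannot tell whether the newest folder finished writing. The manifest's file name and format are not specified here, so this sketch uses only the key listing.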