Topic
This article explains Dropsuite's ingestion service for archived email messages.
Environment
Dropsuite
Description
Email ingestion is the process of importing email data in bulk in the form of EML or PST files into the Dropsuite system. Dropsuite provides this service at no charge to our customers. Refer to the sections below for more information on how our ingestion process works.
- How the Dropsuite Free Ingestion Service Works
- Amazon S3 Bucket Credentials
- How to Upload Your Data
- Essential Points to Keep in Mind
- Overview of Ingestion Types
- Mapping File Guidelines
- File Names
How the Dropsuite Free Ingestion Service Works
Once we have received your request and forwarded it to our service delivery team, a team member will contact you through our support ticketing system. The team member will introduce themselves and tell you which time zone they work in and how often you can expect updates. Typically, you can expect a response within two business days after we have transferred your support ticket to our service delivery team.
Amazon S3 Bucket Credentials
Dropsuite uses Amazon S3 (Simple Storage Service) buckets for secure and durable cloud-based storage. To transfer your files, you'll need Amazon S3 bucket credentials. Your Dropsuite team member will obtain the required credentials and then send them to the email address associated with your account. We will not post your credentials in your support ticket.
Along with the credentials, you will receive a short guide and a download link for S3 Browser, a third-party tool for working with Amazon S3. We recommend S3 Browser for accessing and interacting with S3 buckets because it is free and easy to use.
How to Upload Your Data
Once you receive the Amazon S3 credentials, follow these steps to upload your data:
- Download S3 Browser from S3 Browser Download (external link).
- Install the application.
- Launch S3 Browser.
  When prompted, enter the Access Key ID and Secret Access Key provided in the email. Be sure to input the details exactly as they appear, without any extra characters.
- Enter the bucket name.
  Enter the bucket name in all caps with no spaces. For example: REPLACEWITHBUCKETNAME.
- Upload your files.
  Navigate to the uploads folder in S3 Browser. You can now start uploading the files for ingestion. Dropsuite only accepts zipped EML files or unzipped PST files for ingestion.
Essential Points to Keep in Mind
Folder Structure Retention
- If you upload your EML or PST files with an existing folder structure, the ingestion system will retain that structure.
- The system will ingest files on an "as-is" basis. If your files are uploaded within additional top-level directories (for example, if you zip your files in a particular folder format), the system will ingest them in that exact structure.
- To avoid retaining any folder or sub-folder structure, upload the EML or PST files directly to the upload folder in the Amazon S3 bucket.
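Because the archive is ingested "as-is", the names stored inside the ZIP file determine which folder structure is retained. The following is a minimal sketch, not a Dropsuite tool, that uses Python's standard zipfile module to build an EML archive either with or without its folder structure. The function name zip_eml_files and the keep_folders parameter are hypothetical names chosen for this illustration.

```python
import zipfile
from pathlib import Path


def zip_eml_files(source_dir: str, zip_path: str, keep_folders: bool = True) -> list[str]:
    """Zip every .eml file under source_dir; return the names stored in the archive."""
    source = Path(source_dir)
    stored: list[str] = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if not path.is_file() or path.suffix.lower() != ".eml":
                continue  # only EML files belong in the archive
            # keep_folders=True stores the path relative to source_dir, so the
            # folder structure travels with the archive. keep_folders=False
            # stores only the bare filename, so no folder structure is kept.
            name = path.relative_to(source).as_posix() if keep_folders else path.name
            if name in stored:
                raise ValueError(f"duplicate name in flat archive: {name}")
            archive.write(path, arcname=name)
            stored.append(name)
    return stored
```

For example, zip_eml_files("export", "export.zip", keep_folders=False) would store inbox/a.eml simply as a.eml. The sketch raises an error when two files in different folders share a name, because flattening would otherwise make them collide.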
Ingestion Process
- Only zipped EML files or PST files are accepted for ingestion. PST files should not be zipped. The ingestion system will disregard any other file types. Any non-EML items you include in the files, such as contacts (VCF) or calendar appointments and meetings (ICAL), will not be ingested. The ingestion system will ignore these items, even if present within PST files.
- Dropsuite uses de-duplication technologies during the ingestion process. This process can reduce the amount of data ingested because some of the data may already exist in our databases. As a result, the volume of data ingested may be smaller than the volume of source data you provided, so the two volumes cannot be compared directly to confirm that the ingestion completed successfully.
Compression Software and File Format
- EML files should be zipped, but do not zip PST files.
- We support only Zip and Gzip compression. Our application does not currently support 7-Zip compression.
- To ensure a smooth ingestion process, make sure that your data contains only EML or PST files. The system cannot process other formats.
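The format rules above can be checked locally before you upload anything. The following is a minimal sketch, assuming files are identified by their extensions (.pst, .eml, .zip, .gz, .7z). The function name upload_warnings and the warning texts are hypothetical, and the check does not reproduce Dropsuite's own validation.

```python
import zipfile
from pathlib import Path


def upload_warnings(path: str) -> list[str]:
    """Return warnings for one upload candidate; an empty list means none were found."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix == ".pst":
        return []  # PST files are uploaded as-is, not zipped
    if suffix == ".eml":
        return ["EML files should be zipped before upload"]
    if suffix == ".7z":
        return ["7-Zip archives are not supported; use Zip or Gzip"]
    if suffix == ".gz":
        return []  # Gzip is accepted; this sketch does not inspect its contents
    if suffix != ".zip":
        return [f"{p.name}: only EML and PST data is ingested"]
    warnings = []
    with zipfile.ZipFile(p) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # directory entry, not a file
            inner = Path(name).suffix.lower()
            if inner == ".pst":
                warnings.append(f"{name}: PST files should not be zipped")
            elif inner != ".eml":
                warnings.append(f"{name}: not an EML file, will be ignored")
    return warnings
```

Running the check over every file you plan to upload gives a quick list of items that would be skipped or rejected, such as contacts or calendar entries that were exported alongside the EML files.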
Overview of Ingestion Types
Dropsuite supports two types of ingestion: archiver and non-archiver. We explain each type below:
Archiver Ingestion
You would typically use an archiver ingestion when you cannot link the email data to a specific user. This situation often happens when the source files are a mix of EML files from multiple accounts, and the exact user association is unknown. In such cases, you create a "catch-all" email account and then import all the data into this single account. During the ingestion process, the system will map the EML files to the catch-all account or attempt to determine the association with individual accounts based on the provided data.
An example of archiver ingestion is when you have a large collection of unsorted EML files and cannot determine which file belongs to which user. In this case, you can upload all the data into a single account (the catch-all) to preserve the information.
Non-Archiver Ingestion
In a non-archiver ingestion, the mapping between source files and email accounts is predetermined because you know exactly which files correspond to which email accounts before starting the ingestion process.
The system will directly ingest the files into their designated email accounts, eliminating the need for manual sorting.
Example: You have a folder of PST files and have labeled each file according to the user account it belongs to. This scenario allows for a straightforward ingestion into the respective accounts.
Mapping File Guidelines
For the ingestion process to function correctly, you will need to provide a mapping file to associate the source files with the respective email accounts. Each ingestion type requires a different kind of mapping file.
Archiver Ingestion Mapping File
You would typically use an archiver ingestion when you cannot link the email data to a specific user. This scenario typically occurs when the source files are a mix of EML files from multiple accounts, and the exact association with users is unknown. In this scenario, the mapping file will be simpler to fill in.
The sample mapping file you will receive for an archiver ingestion will look like this:
| File Name |
| micro_file_sample.zip |
| sample.pst |
| micro_sample.pst |
The table below describes the fields in the mapping file and the data you need to enter into each field.
| Field Name | Description |
| file_name | This field should contain the filenames of the email data to be ingested. It must include the full filename with the correct extension. If the file is in subfolders, include the full file path. |
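If you have many files, the single-column list can be generated rather than typed. The following is a minimal sketch, assuming the mapping file can be saved as CSV; the article does not state the file format, so check the sample file you receive. The function name archiver_mapping_csv is hypothetical, and the header text should be copied exactly from your sample file (the default here follows the field name in the table above).

```python
import csv
import io


def archiver_mapping_csv(file_names: list[str], header: str = "file_name") -> str:
    """Build single-column mapping text with one uploaded file per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([header])  # assumption: copy the header from your sample file
    for name in file_names:
        # Each entry is the full filename with its extension, plus the
        # folder path if the file sits in subfolders.
        writer.writerow([name])
    return buffer.getvalue()


print(archiver_mapping_csv(["micro_file_sample.zip", "sample.pst"]))
```

The csv module adds quoting automatically if a filename contains a comma, which is easy to get wrong when building the text by hand.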
Non-Archiver Ingestion Mapping File
In non-archiver ingestion, the mapping between the source files and email accounts is predetermined because you know exactly which files correspond to which email accounts before you start the ingestion process. In this scenario, the mapping file will require additional information.
The sample mapping file you will receive for a non-archiver ingestion will look like this:
| account_id | email_account_id | file_name |
| XXXXXX | test1@xyz.onmicrosoft.com | test_emls/test_sample.pst |
| XXXXXX | test2@xyz.onmicrosoft.com | test_emls/test_eml.zip |
The table below explains the fields in the mapping file and the data you should enter into each field.
| Field Name | Description |
| account_id | The account ID assigned to your Dropsuite organization. |
| email_account_id | The email account ID. This field is only necessary when the user's email does not match the email on the tenant or if there are multiple users with the same email address. |
| The user’s email address for which the data should be ingested. | |
| file_name | The filenames of the email data to be ingested. It must include the full filename with the correct extension. If the file is in subfolders, include the full file path. |
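A quick check of the completed mapping file can catch empty cells and filenames without an extension before you send it. The following is a minimal sketch, assuming the mapping file is saved as CSV with the column names shown above and that account_id and file_name must be filled in on every row. Both points are assumptions for this illustration, and the function name mapping_problems is hypothetical.

```python
import csv
import io
from pathlib import PurePosixPath

# Assumption: these two columns must be filled in on every row.
REQUIRED_COLUMNS = ("account_id", "file_name")


def mapping_problems(csv_text: str) -> list[str]:
    """Return human-readable problems found in non-archiver mapping text."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # The header is row 1, so the first data row is row 2.
    for row_number, row in enumerate(reader, start=2):
        for column in REQUIRED_COLUMNS:
            if not (row.get(column) or "").strip():
                problems.append(f"row {row_number}: {column} is empty")
        file_name = (row.get("file_name") or "").strip()
        # The full filename must include its extension, e.g. .pst or .zip.
        if file_name and not PurePosixPath(file_name).suffix:
            problems.append(f"row {row_number}: file_name has no extension")
    return problems
```

An empty result means only that these basic checks passed; it does not confirm that the listed files exist in the bucket.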
File Names
This section applies to both archiver and non-archiver ingestions.
The file_name field of both mapping files should contain the filenames of the email data you need ingested and include the full filename with the correct extension. If the file is in subfolders, include the full file path.
The examples below show the difference:
File uploaded directly to the S3 "uploads" folder (no folders)
| account_id | email_account_id | file_name |
| XXXXXX | test1@xyz.onmicrosoft.com | sample.pst |
File enclosed in multiple folders within the S3 bucket
| account_id | email_account_id | file_name |
| XXXXXX | test2@xyz.onmicrosoft.com | FOLDER-NAME1/FOLDER-NAME2/sample.pst |
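The two examples above suggest that the file_name value is the file's path relative to the uploads folder. The following is a minimal sketch under that assumption; the function name mapping_file_name and the default folder name "uploads" are hypothetical choices for this illustration.

```python
from pathlib import PurePosixPath


def mapping_file_name(s3_key: str, upload_folder: str = "uploads") -> str:
    """Turn an object's path in the bucket into the value for the file_name field."""
    key = PurePosixPath(s3_key.lstrip("/"))
    # relative_to raises ValueError if the object is not inside upload_folder,
    # which flags files that were uploaded to the wrong place.
    return key.relative_to(upload_folder).as_posix()


print(mapping_file_name("uploads/sample.pst"))
# prints "sample.pst"
print(mapping_file_name("uploads/FOLDER-NAME1/FOLDER-NAME2/sample.pst"))
# prints "FOLDER-NAME1/FOLDER-NAME2/sample.pst"
```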