Ingestion of email messages from old archives

Doug Chanin

Topic

This article explains Dropsuite's ingestion service for archived email messages.

Environment

Dropsuite

Description

Email ingestion is the process of importing email data in bulk in the form of EML or PST files into the Dropsuite system. Dropsuite provides this service at no charge to our customers. Refer to the sections below for more information on how our ingestion process works.

How the Dropsuite Free Ingestion Service Works

Once we have received your request and forwarded it to our service delivery team, a team member will contact you through our support ticketing system. The team member will introduce themselves and tell you which time zone they work in and how often you can expect updates. Typically, you can expect a response within two business days after we have transferred your support ticket to our service delivery team.

Amazon S3 Bucket Credentials

Dropsuite uses Amazon S3 (Simple Storage Service) buckets for secure and durable cloud-based storage. To transfer your files, you'll need Amazon S3 bucket credentials. Your Dropsuite team member will obtain the required credentials and then send them to the email address associated with your account. We will not post your credentials in your support ticket.

If you need the credentials sent to a different contact person, notify the support team when you first contact them, and they will update the email address to which we send your credentials.

Along with the credentials, you will receive a short guide and a download link for S3 Browser, a third-party tool for Amazon S3. We recommend S3 Browser for accessing and interacting with S3 buckets because it is a free tool that is easy to use.

Dropsuite does not provide support for the S3 Browser, as it is third-party software. We have no control over factors such as upload speed or file size limits when using this tool.

How to Upload Your Data

Once you receive the Amazon S3 credentials, follow these steps to upload your data:

  1. Download S3 Browser from S3 Browser Download (external link).
  2. Install the application.
    If you have a large amount of data to upload, consider purchasing the S3 Browser Pro version, which offers concurrent upload capability for faster uploads.
  3. Launch S3 Browser.
    When prompted, enter the Access Key ID and Secret Access Key provided in the email. Be sure to input the details exactly as they appear, without any extra characters.
  4. Enter the bucket name.
    Enter the bucket name in all caps with no spaces. For example: REPLACEWITHBUCKETNAME.
  5. Upload your files.
    Navigate to the uploads folder in S3 Browser. You can now start uploading your files for ingestion. Dropsuite only accepts zipped EML files and unzipped PST files for ingestion.

The credentials link provided in the email will expire in 7 days, so we recommend that you securely store the credentials locally for future use.

Essential Points to Keep in Mind

Folder Structure Retention

  • If you upload your EML or PST files with an existing folder structure, the ingestion system will retain that structure.
  • The system will ingest files on an "as-is" basis. If your files are uploaded within additional top-level directories (for example, if you zip your files in a particular folder format), the system will ingest them in that exact structure.
  • To avoid retaining any folder or sub-folder structure, upload your files directly to the uploads folder in the Amazon S3 bucket, without any enclosing folders.
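
As a rough illustration of the folder-structure behavior described above, the Python sketch below zips EML files either with their relative folders preserved or flattened to bare file names. The function name `zip_eml_files` and its `flatten` flag are hypothetical helpers invented for this example; they are not part of any Dropsuite tooling, and the sketch assumes standard Zip compression.

```python
import zipfile
from pathlib import Path


def zip_eml_files(src_dir, out_zip, flatten=False):
    """Zip every .eml file under src_dir and return the names stored in the archive.

    flatten=False keeps each file's path relative to src_dir, so that folder
    structure travels with the archive. flatten=True stores bare file names.
    """
    src = Path(src_dir)
    stored = []
    with zipfile.ZipFile(out_zip, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*.eml")):
            arcname = path.name if flatten else path.relative_to(src).as_posix()
            if arcname in stored:
                # Two files in different folders share a name; flattening would clash.
                raise ValueError(f"duplicate name when flattening: {arcname}")
            zf.write(path, arcname)
            stored.append(arcname)
    return stored
```

Inspecting the returned names before uploading shows which folder paths, if any, are carried inside the archive.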

Ingestion Process

  • The ingestion system accepts only zipped EML files and unzipped PST files, and disregards any other file type. Items that are not email messages, such as contacts (VCF) or calendar appointments and meetings (ICAL), will not be ingested, even if they are present within PST files.
  • Dropsuite uses de-duplication technologies during the ingestion process. This process can reduce the amount of data ingested because some of the data may already exist in our databases. As a result, you cannot compare the volume of data ingested with the size of the source data to verify that the ingestion completed successfully.

Compression Software and File Format

  • Zip your EML files, but do not zip PST files.
  • We only support Zip and Gzip compression. Our application does not currently support 7-Zip compression.
  • To ensure a smooth ingestion process, make sure that your data contains only EML or PST files. The system cannot process other formats.
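
The file-format rules above can be checked locally before uploading. The following is a minimal sketch, assuming files are identified by their extensions (`.pst`, `.zip`, `.gz`, `.7z`, `.eml`); the function `check_upload_file` is a hypothetical helper for illustration and does not reflect how the Dropsuite ingestion system itself inspects files.

```python
import zipfile
from pathlib import Path


def check_upload_file(path):
    """Return a list of problems for one file; an empty list means none were found."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix == ".pst":
        return []  # PST files are uploaded as-is, not zipped
    if suffix == ".gz":
        return []  # Gzip is supported; this sketch does not inspect the contents
    if suffix == ".7z":
        return ["7-Zip archives are not supported; use Zip or Gzip"]
    if suffix == ".eml":
        return ["EML files should be zipped before upload"]
    if suffix != ".zip":
        return [f"unsupported file type: {suffix or '(no extension)'}"]
    if not zipfile.is_zipfile(p):
        return ["not a readable Zip archive"]
    problems = []
    with zipfile.ZipFile(p) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            inner = Path(info.filename).suffix.lower()
            if inner == ".pst":
                problems.append(f"{info.filename}: PST files should not be zipped")
            elif inner != ".eml":
                problems.append(f"{info.filename}: not an EML file, will not be ingested")
    return problems
```

Running the check over every file in a staging folder gives a quick list of items to fix before the upload starts.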

Overview of Ingestion Types

Dropsuite supports two types of ingestions: archiver and non-archiver. We explain each type below:

Archiver Ingestion

You would typically use an archiver ingestion when you cannot link the email data to a specific user. This situation often happens when the source files are a mix of EML files from multiple accounts, and the exact user association is unknown. In such cases, you create a "catch-all" email account and then import all the data into this single account. During the ingestion process, the system will map the EML files to the catch-all account or attempt to determine the association with individual accounts based on the provided data.

An example of archiver ingestion is when you have a large collection of unsorted EML files and cannot determine which file belongs to which user. In this case, you can upload all the data into a single account (the catch-all) to preserve the information.

Non-Archiver Ingestion

In a non-archiver ingestion, the mapping between source files and email accounts is predetermined: you know exactly which files correspond to which email accounts before starting the ingestion process.

The system will directly ingest the files into their designated email accounts, eliminating the need for manual sorting.

Example: You have a folder of PST files, and have labeled each file according to the user account it belongs to. This scenario allows for a straightforward ingestion into the respective accounts.

Mapping File Guidelines

For the ingestion process to function correctly, you will need to provide a mapping file to associate the source files with the respective email accounts. Each ingestion type requires a different kind of mapping file.

Dropsuite cannot build, upload, or manage mapping files on your behalf. We cannot be held responsible for data ingested into the wrong destination. The ingestion process is fully automated, and Dropsuite personnel do not have access to the system during this process. Make sure to use the comma delimiter in your CSV file. If you use any other delimiter, the ingestion will not work. If this happens, you will need to fix your mapping file and upload it again.
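
Because a wrong delimiter stops the ingestion, it can help to check the mapping file before uploading it. The sketch below is one possible check, not a Dropsuite tool: it parses the text as comma-delimited CSV and flags rows that appear to use a semicolon, tab, or pipe instead. The function name `find_delimiter_problems` and the list of suspect delimiters are assumptions made for this example.

```python
import csv
import io

# Characters that commonly replace the comma when a spreadsheet exports "CSV".
SUSPECT_DELIMITERS = (";", "\t", "|")


def find_delimiter_problems(csv_text):
    """Return messages for rows that do not look comma-delimited."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))  # splits on commas by default
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        return ["mapping file is empty"]
    problems = []
    width = len(rows[0])  # the header row defines the expected field count
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            problems.append(f"row {number}: expected {width} fields, found {len(row)}")
        for cell in row:
            hits = [d for d in SUSPECT_DELIMITERS if d in cell]
            if hits:
                problems.append(f"row {number}: field contains {hits[0]!r}, possible wrong delimiter")
                break
    return problems
```

An empty result means no obvious delimiter problem was found; it does not confirm that the mapping itself is correct.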

Archiver Ingestion Mapping File

You would typically use an archiver ingestion when you cannot link the email data to a specific user. This often occurs when the source files are a mix of EML files from multiple accounts, and the exact association with users is unknown. In this scenario, the mapping file is simpler to fill in.

The sample mapping file you will receive for an archiver ingestion will look like this:

File Name
micro_file_sample.zip
sample.pst
micro_sample.pst

The table below describes the fields in the mapping file and the data you need to enter into each field.

Field Name Description
file_name This field should contain the filenames of the email data to be ingested. It must include the full filename with the correct extension. If the file is in subfolders, include the full file path.
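
As a sketch of how such a single-column file could be produced, the Python example below writes one file name per row using the standard `csv` module. The helper name `write_archiver_mapping` is hypothetical, and the `file_name` header is an assumption taken from the field table; if the sample file you receive uses a different header, pass that header instead.

```python
import csv


def write_archiver_mapping(file_names, out_path, header="file_name"):
    """Write a single-column, comma-delimited mapping file."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)  # comma is the default delimiter
        writer.writerow([header])
        for name in file_names:
            # Full file name with extension; include the folder path if there is one.
            writer.writerow([name])


# Example with the file names from the sample above:
# write_archiver_mapping(
#     ["micro_file_sample.zip", "sample.pst", "micro_sample.pst"],
#     "archiver_mapping.csv",
# )
```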

Non-Archiver Ingestion Mapping File

In non-archiver ingestion, the mapping between the source files and email accounts is predetermined because you know exactly which files correspond to which email accounts before you start the ingestion process. In this scenario, the mapping file will require additional information.

The sample mapping file you will receive for a non-archiver ingestion will look like this:

account_id  email_account_id  email                      file_name
XXXXXX                        test1@xyz.onmicrosoft.com  test_emls/test_sample.pst
XXXXXX                        test2@xyz.onmicrosoft.com  test_emls/test_eml.zip

The table below explains the fields in the mapping file and the data you should enter into each field.

Field Name Description
account_id The account ID assigned to your Dropsuite organization.
email_account_id The email account ID. This field is only necessary when the user's email does not match the email on the tenant or if there are multiple users with the same email address.
email The user’s email address for which the data should be ingested.
file_name The filenames of the email data to be ingested. It must include the full filename with the correct extension. If the file is in subfolders, include the full file path.
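
The four fields above map naturally onto a comma-delimited file. The following Python sketch writes one with `csv.DictWriter`, leaving `email_account_id` empty when it is not supplied. The helper name `write_non_archiver_mapping` and the choice to treat the other three fields as required are assumptions for illustration, not documented Dropsuite validation rules.

```python
import csv

FIELDS = ["account_id", "email_account_id", "email", "file_name"]
REQUIRED = ("account_id", "email", "file_name")  # assumption: only email_account_id may be blank


def write_non_archiver_mapping(rows, out_path):
    """Write mapping rows (dicts keyed by field name) as comma-delimited CSV."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        # restval="" leaves email_account_id blank when a row omits it.
        writer = csv.DictWriter(handle, fieldnames=FIELDS, restval="")
        writer.writeheader()
        for number, row in enumerate(rows, start=1):
            missing = [name for name in REQUIRED if not row.get(name)]
            if missing:
                raise ValueError(f"row {number} is missing: {', '.join(missing)}")
            writer.writerow(row)


# Example using the placeholder values from the sample above:
# write_non_archiver_mapping(
#     [{"account_id": "XXXXXX",
#       "email": "test1@xyz.onmicrosoft.com",
#       "file_name": "test_emls/test_sample.pst"}],
#     "non_archiver_mapping.csv",
# )
```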

File Names

This section applies to both archiver and non-archiver ingestions.

The file_name section of both mapping files should contain the filenames of the email data you need ingested and include the full filename with the correct extension. If the file is in subfolders, include the full file path.

The examples below illustrate the difference:

File uploaded directly to the S3 "uploads" folder (no folders)

account_id  email_account_id  email                      file_name
XXXXXX                        test1@xyz.onmicrosoft.com  sample.pst

File enclosed in multiple folders within the S3 bucket

account_id  email_account_id  email                      file_name
XXXXXX                        test2@xyz.onmicrosoft.com  FOLDER-NAME1/FOLDER-NAME2/sample.pst
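
The two examples above suggest that file_name is the path of the file relative to the uploads folder. Under that assumption, the short Python sketch below derives the file_name value from a full path inside the bucket. The function `mapping_file_name` and the default folder name `uploads` are illustrative assumptions, not a Dropsuite API.

```python
from pathlib import PurePosixPath


def mapping_file_name(object_path, uploads_folder="uploads"):
    """Return the path of a file relative to the uploads folder."""
    path = PurePosixPath(object_path)
    try:
        relative = path.relative_to(uploads_folder)
    except ValueError:
        raise ValueError(f"{object_path!r} is not inside {uploads_folder!r}") from None
    if not relative.parts:
        raise ValueError("path points at the uploads folder itself, not at a file")
    return relative.as_posix()


# mapping_file_name("uploads/sample.pst")
# mapping_file_name("uploads/FOLDER-NAME1/FOLDER-NAME2/sample.pst")
```

The first call yields the bare file name, and the second keeps the enclosing folders, matching the two cases shown above.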
