How to get data in and out of Salesforce
Overview
- TL;DR What is your Salesforce RPO & RTO?
- Data Quality
- Export .csv from Salesforce
- FieldDump from AppExchange
- Issues with reading CSV in Excel
- Insert .CSV into Salesforce
- Wizard vs DataLoader
- In-built Data Import Wizard
- In-built Apex Data Loader
- 3rd-Party
- Java Apex Code
- REST API calls externally
- GitHub / Tools
- Learning Modules
- References
- More about Salesforce
This article is a succinct hands-on deep dive on how to get data in and out of Salesforce.
3 billion transactions go through Salesforce daily.
QUESTION: Please correct me if I’m wrong here.
There is a “dirty little open secret” about how companies actually work with Salesforce data.
Internally, Salesforce uses Oracle RAC databases, which “log ships” every change instantaneously to another region for disaster-proof recovery.
However, Salesforce does not provide a way for users to restore data from backups it has taken.
TL;DR What is your Salesforce RPO & RTO?
The IT industry-wide has names for recovery of data in case servers fail in enterprise data centers:
RTO (Recovery Time Objective) is the time that a company is willing to wait for data to be recovered. For Salesforce users not subscribing to a 3rd-party service, RTO would be in days or weeks rather than in minutes.
RPO (Recovery-Point Objective) is the amount of data that a company is willing to lose when restoring from disaster. Salesforce, by default, does not provide “point-in-time” recovery from “snapshots” of all data at various points in time.
Export of data using built-in Salesforce functionality is not automatically enabled when an account is established. And Salesforce limits full backups to once a week. Full sandboxes cost more money.
If you don’t run backup jobs, Salesforce charges a minimum of $10,000 to obtain that data, which will take a week or more.
Sandboxes for developer use are created only with metadata, not data.
Salesforce does not provide a comprehensive way to import CSV files exported. So intricate programming is needed, or pay thousands a month for 3rd-party utilities (described below).
(During cloning, existing users and their status and appends the name of the sandbox to logon Id’s, but keeps passwords.)
Salesforce encountered a data issue on May 17, 2019 related to Pardot.
Data Quality
This is covered in the 211 course and Trailhead module: Data Quality +700
- Assess, cleanse, and maintain data quality
- Prevent duplicate records using Duplicate Management
- Clean and enrich data with data.com, which uses external databases as the basis for cleaning Leads, Contacts, Accounts (not product Opportunities or Assets).
See validity.com #DemandTools, #PeopleImport, #DupeBlocker and #BriteVerify
Export .csv from Salesforce
There are several ways to obtain an export:
A. Use the Salesforce GUI
B. Use the FuseIT CLI program for Windows
FieldDump from AppExchange
FieldDump is a free add-on via AppExchange that extracts a Data Model to a spreadsheet readable by Microsoft Excel, Google Sheet, etc.
Export using Salesforce GUI
-
In Salesforce, enter Setup to look for “Export” (“Data Export”).
Notice object names are not sorted.
- Click checkbox “Include all data” under the “Weekly Export Service” heading.
- Click “Export Now”. Click “Schedule Now”
- Wait for an email with subject “Your Organization Data Export has completed.”
- In your email program (Outlook/Exchange, etc.), download the zip file by clicking the link. The file name is like “WE_00DG0000000gbbAMAQ_1.ZIP”.
Export using FuseIT
Salesforce Consultants FuseIT in New Zealand has a Salesforce Explorer
https://fuseit.com/Solutions/SFDC-Explorer/Help-Data-Exports.aspx that includes “Data Exports”. Their MongoDB for Salesforce (“M4S”) connector (super cache) for ASP.NET web app integration (which they call “G4S”).
Their CLI for Windows, after download and install, can be run by a scheduler to request exports with a command such as:
FuseIT.SFDC.DataExportConsole.exe \ /u:user@test.com /p:12345678 \ /t:C3P3qivIf5t6Q6uYtzxxxxxx \ /e:Production "C:\\SFDCexports"
.csv files exported
Within the export (backup) zip file are CSV files for each object. One file for each object and internal table. Remember:
- Export does not include any of your metadata (critical in rebuilding custom work and relationships).
- Export does not include attached files.
- Export can take a week or more to process through the queue.
- Exports are not allowed from sandboxes.
Export Backups in DE (Developer eXperience) orgs can occur Monthly only.
Backups in prod (and Sandboxes) can occur weekly or monthly.
CAUTION: Salesforce does not allow daily backups.
More frequent backups can occur using a 3rd party app on AppExchange.
Each .csv file cannot be large than 500MB.
Spreadsheet of .csv files
Our spreadsheet of objects exported has 331 data rows plus a header row.
Column names in the header row begin with an underline so they always sort to the top. Thus, column A is named “_Seq”,
The “_Backup File Name” column contains file names ending in .csv. The list was created by Salesforce after all objects were selected in the export form.
My list contains objects from several add-ons:
- CnP_… for “Click and Pledge”
- EventbriteSync__… for the Eventbrite activity calendar synchronization
- MC4SF__… for MailChimp for Salesforce (4SF) from AppExchange
CAUTION: Read the reviews to each add-on listed above.
_SEN
containing TRUE mark fields which contain sensitive information such as email address, birthdates, Social Security Numbers, bank account identifiers, and other personally identifiable information (PEI). These fields need to be cleansed when not in the production instance.
_Rows
counts the rows in each .csv file.
- “0” values are for blank files.
- “1” values are for files containing just the header row but no data.
- “7” would be for a file with 6 data records plus a header.
_LABEL
is the checkbox field lable in the Export form on Salesforce.
_API_NAME
and _DEPLOYED
are from the Object Manager at:
https://Account.lightning.force.com/lightning/setup/ObjectManager/home
Excel
Many use Microsoft Excel to create and edit CSV files for import into Salesforce.
There are some tricks to using it.
-
Excel can open CSV files automatically when it’s double-clicked on Finder.
-
Excel has no in-built way to specify the format of each field in CSV files. You’ll need to save the file in Excel format to manually specify number fields as such.
PROTIP: The leading zero in Zip codes get stripped automatically.
Double-quotes within text are problematic because they are also used to define the beginning and end of fields.
Issues with reading CSV in Excel
The issue with .csv files exported are these:
Data Types
Data type—primitive types: collections, sObjects, user-defined types, and built-in Apex types.
There are three main types of collections in Apex:
- Sets – unordered collection of elements that do not contain any duplicates.
- Lists – ordered collection of elements distinguished by indices.
- Maps – key-value pairs with each unique key mappings to a single value. Keys and values can be any data type—primitive type.
The import file should include a record owner for each record (defaulting to the account used to do importing).
Schema of Data Object Dependencies
PROTIP: One cannot just insert data of any given object from a .CSV file because of dependencies.
The Salesforce Console provides a dependency viewer at: ???
Click to pop-up full screen image*
There is more “smarts” with Master-detail vs. lookup:
- Only 2 are allowed per object (vs 25 lookups)
- Access to parent determines access to children.
- All users who can access a specific parent record will also has access to all the related child records (parent has private OWD setting and has sharing rules on top of it)
- Activities tagged to child records has to be shown on parent account too
- When the parent record is deleted all the child records attached are deleted.
- Rollup summary fields to parent based on SUM, AVG, MIN of child records
- A child of one master detail relationship cannot be the parent of another.
There are many-to-many relationships. These are defined using junction objects which combines in a custom object two master-detail relationships.
https://github.com/rsoesemann/salesforce-plantuml (by Robert Sösemann who ported the PMD extensible multilanguage static code analyzer to Salesforce) is an open-sourcce native Force.com application app that generates UML class & ER-diagrams from your org data. It leverages the PlantUML JavaScript Deflate and other libraries.
ALAS: Etherios EasyDescribe free app to view and extract object metadata by West Monroe Partners (of Chicago) is dated 2010 for Winter 11 / 1.91.0 and listed as “private”.
Layout Page free app from Clerisoft in 2015 converts any Salesforce Page Layout (Standard OR Custom) into a Standard Visualforce Page in just 3 steps.
References:
- https://www.youtube.com/watch?v=fUD4MzgA0gk
- http://certifiedondemand.com/lookup-or-master-detail-relationship-in-salesforce
- http://www.salesforcetutorial.com/object-relationships-in-salesforce/
- https://help.salesforce.com/HTViewHelpDoc?id=relationships_considerations.htm
- https://help.salesforce.com/HTViewHelpDoc?id=overview_of_custom_object_relationships.htm
- https://developer.salesforce.com/docs/atlas.en-us.api.meta/api/relationships_among_objects.htm
Insert .CSV into Salesforce
To insert data within .csv back into Salesforce, there are several approaches:
A. Use the Excel Connector B. Use the Salesforce Wizard or DataLoader C. Use a 3rd-party utility
For “DYIers” who don’t want to spend extra money but have the technical chops:
D. Write Java Apex code to run within a Salesforce Console E. Write REST API calls within a custom web app program (in Java, Python, or other language) to insert into Salesforce databases.
Excel Connector
Because Salesforce generates .csv files that Microsoft Excel and Google Sheet can read, many think that Salesforce should also read .csv files as input for import as well, in a “round-trip”.
The Force.com Excel Connector is an Add-on to Microsoft Excel via the Toolkit for Office. It promises bi-directional access to the Force.com API.
So it’s useful for cleaning and mass-updating salesforce.com-based data.
Updated features include access to Products2 and custom objects, API names or labels, simple query wizard, readable user names, etc.
It allows you to upload and export data directly in and out of an excel sheet.
QUESTION: What abut web-based (SaaS) Microsoft 360? or Google Sheets?
Wizard vs DataLoader
Based on VIDEO: Data Import: Choosing the Right Tool:
Data Import Wizard | Apex Data Loader | |
---|---|---|
Import data format | CSV | CSV |
Max. # records | < 50,000 | < 5 million |
Update existing records | yes | no |
Delete existing records | no | yes |
Catch duplicates | yes | no |
Import Opportunities | no | yes |
Can turn off Workflow rules | yes | no |
Can export data | no | yes |
Can save mappings | no | yes |
Both can handle custom as well as standard objects.
Both trigger validation rules during importing.
In-built Data Import Wizard
The Data Import Wizard is inbuilt within Salesforce to provide step-by-step guidance to manually specify how to get CSV files into Salesforce standard objects such as Leads, Accounts, Contacts, Solutions and custom objects.
Based on this article:
-
Create a dedicated Data Loader Salesforce user account.
-
Login by appending the Security Token to extend the account’s Password.
-
Set that account’s User Profile to not fire triggers, workflows, or other automation.
-
Setup that user whenever you setup automation.
-
Create a checkbox fields on the user record labeled “Authorized to Send Email Alerts”. Include this field in your workflow rules, triggers or other automation, similar to option 1.
-
Toggle the flag on/off depending on the type of data load.
-
Download the DataLoader.
-
Have a program calculate the number of records in the file. Salesforce limits imports of up to 50,000 records at a time. Use this Linux command:
wc -l importfile.csv
-
Before each upload, run a report to count the number of records, so you have a “before” picture.
-
Do a test run with just a couple records, so it’s easier to recover from mistakes.
-
After an upload, run a report to calculate records and identify what has changed.
In-built Apex Data Loader
The standard Salesforce Data Loader is for migrating CSV datasets into Salesforce, up to 5 million records at a time.
But recurring data loads can be scheduled. It is a program that communicates with the Salesforce cloud, installed locally on a laptop or on-premise server. Some consider it annoying to install on a Mac.
Permissions?
It can match external identifier keys. But its Auto Mappings is tricky to define, which can lead to wrong mappings or missing fields.
3rd-Party
Several partners of Salesforce offer tools.
Own backup
ownbackup.com is the premier solution.
DataLoader.io
Dataloader.io from AppExchange is a popular tools because it works on the browser (“no software”), no installation It and Salesforce security token required. It’s intuitive and wizard driven, but takes a little longer to setup and runs slower than the Salesforce Apex Data Loader.
However, unlike the Apex Data Loader, it handles CSV files in DropBox, Box, SFTP, FTP. This app also provides users with list of fields from related objects, making exports easier and cuts down post-export work required to bring multi-object data into a single row.
Scheduling jobs?
Jitterbit ODBC/JDBC Data Loader
Jitterbit from AppExchange is a free program installed on Windows and macOS. It is offered by Jitterbit Inc. as an entry offering to their full integration suite.
It is intended for tech savvy users who operate local databases (such as MySQL) and want to connect them to Salesforce. Its specialty is loading not just csv flat files but running ETL scripts that use ODBC/JDBC.
Thus, its meant to process repetitive runs and not one-offs.
Informatica Cloud Data Loader
Informatica’s Cloud Data Loader from AppExchange provides powerful data transformation logic during loading “on the fly”, which enables more powerful formatting of data fields.
It comes with prebuilt integration templates for systems such as NetSuite or SAP;
It is part of a very sophisticated integration suite called “built-in intelligence”. So it’s for tech-saavy types.
It connects to Box.
Pity it only handles CSV files.
Vlocity
Vlocity’s DataRaptor is an extract, transform and load (ETL) tool that integrates with their Omniscripts to read and write Salesforce data. The DataRaptor Designer enables app developers to map data to the input format required by OmniScripts, to transform the data as required by business logic, and to write the output data back to Salesforce in compliance with the Salesforce object model.
Talend
Talend Components for Salesforce Data Integration integrates with Talend’s Open Studio and Integration Suite, which Forrester put at the top of all other vendors in both current offering and Strategy dimensions for “Data Integration Tools”.
The tool is open-source (free). It works on Hadoop and Spark big data. It schedules data loads from other systems, using the Salesforce API, to perform system integrations.
In addition to Input and BulkExec connectors, Talend has more sophisticated connectors such as “GetUpdated” which changes data.
LexiLoader
http://macappstore.org/lexiloader says does not exist.
BOFC (Bulk Object Field Creator)
Import & Export Salesforce fields using CSV or XLSX files to perform bulk CRUD operations with Point & Click. from tech9logy.com
Grabs your Salesforce Process builder flow into excel.
See http://salesforcebofc.com/
Apsona
Apsona is a set of SaaS-based tools for Salesforce.
https://apsona.com/pages/sfdc/nonprofits.html
DemandTools
SOQL Studio from Visual Software systems
SOQL Studio from Visual Software systems is a $50/year IDE to query, visualize and extract Salesforce data at a depth and breadth of features not available with the force.com IDE, Developer Console, Workbench, or Data Loader. The tool returns every data point – aggregates, compound fields, related records, multi-level child-to-parent fields. It can work with multiple queries at the same time. It allows for annotation of queries with single or multi-line comments. Queries can be saved for reuse later. Copy/Paste selected bits of you results or export the full results of your SOQL query in Excel, CSV, XML or custom text format, even if your query includes data from child objects.
Reflection Enterprise
reflectionenterprise.com/salesforce-data-export
JetBrains Illuminated Cloud
The Illuminated Cloud @IllumCloud) add-in to JetBrains’ IntelliJ IDE.
It supports multiple Dev Hubs when creating scratch orgs, which is nice when working on packages in different dev hubs.
CloudToolKit
Ben Edwards (@benedwards44 in New Zealand) https://cloudtoolkit.co (salesforcetoolkit.com and sftoolkit.co) open sourced in GitHub a suite of tools running on Heroku:
-
Schema Lister using the Salesforce Metadata API to build a list of objects, fields, and field attributes from within a Salesforce Org.
-
sforgcompare which uses the Salesforce Tooling or Metadata API (user selection) to compare metadata between two Orgs – useful when planning deployments or building deployment packages, as well as seeing what configuration exists in Production to Sandbox or between Sandbox environments.
-
sfcodeclean scans all the (non-packaged) Apex Classes in your Org to build a table of where each property, variable and method is used. Useful for cleaning up unused code. Only Apex Class metadata is captured from your Org.
-
Switch which provides a web interface to easily enable and disable components in your Salesforce Org - Workflows, Triggers and Validation Rules. Very useful when doing data migrations and needing to disable certain automation in Salesforce.
-
packagebuilder which uses the Salesforce Metadata API to build a package.xml file based on components in a Salesforce org specified online. Handy for building a package.xml file for use with ANT Force.com Migration or other IDE tools, or building a destructiveChanges.xml file.
Others:
- Web Form
- Email to Case
- Email to database
Java Apex Code
A java program is more work, but much more flexible, operating at the “atomic” level, particularly with junction objects.
Techniques from the Big Objects Implementation Guide include Async calls of Salesforce Object Query Language in the background.
In the Apex Developer Guide: Inserting and Updating Records:
Using DML, you can insert new records and commit them to the database. Similarly, you can update field values of existing records.
First make a query to obtain related keys used in the query.
try { Account acct = new Account(Name='SFDC Account'); insert acct; // Once the account is inserted, the sObject will be // populated with an ID. // Get this ID. ID acctID = acct.ID; // Add a contact to this account. Contact con = new Contact( FirstName='Joe', LastName='Smith', Phone='415.555.1212', AccountId=acctID); insert con; } catch(DmlException e) { System.debug('An unexpected error has occurred: ' + e.getMessage()); }
The above example inserts three account records and updates an existing account record. First, three Account sObjects are created and added to a list. An insert statement bulk inserts the list of accounts as an argument. Then, the second account record is updated, the billing city is updated, and the update statement is called to persist the change in the database.
References:
-
Java developer resources at http://wiki.developerforce.com/page/Java
-
https://github.com/forcedotcom/CustomMetadataLoader provides an open-source tool to help users bulk create and update custom metadata records in salesforce.com from a CSV file.
REST API calls externally
The advantage of a REST API interface is scale.
- Several instances can update Salesforce at the same time.
- The interface can update various orgs.
- The interface can interact with resources outside Salesforce
Internally, Salesforce tables contain artificial keys call “identifiers” which uniquely identify each record. That identifier is created when new data is added. In parent-child relationships, child tables contain the identifier to its parent. See https://developer.salesforce.com/docs/atlas.en-us.218.0.api_asynch.meta/api_bulk_v2/datafiles_xml_rel_fields.htm?search_text=relationships
Rows created with data external to Salesforce contain an external_id.
That external_id is in CSV files created by the export process.
A program that inserts data based on CSV files cannot use the parent identifer exported because as each row is inserted, a new identifier is created.
Thus, CSV files need to be imported in a specific sequence – parent first, then its children. See Creating Parent and Child Records in a Single Statement Using Foreign Keys and Relationships It has a section on Async API Code walkthough
In Web Services Connector, the “Preparing to Integrate Java Apps with Force.com APIs” and “Creating an Enterprise WSDL Application” sections where it walks through setup, prep, and actually gives you some sample code that shows how to query, update, create, and delete records.
developer-centers/integration-apis (previously https://developer.force.com/REST)
https://developer.salesforce.com/docs/atlas.en-us.218.0.api.meta/api/sforce_api_calls_create.htm#MixedSaveSection
SaveResult[] = connection.create(sObject[] sObjects);
Use create()
to add one or more records, such as an Account or Contact record, to an organization’s information. The create() call is analogous to the INSERT statement in SQL.
When creating objects, consider these rules and guidelines.
- https://github.com/danieljpeter/salesforceMetadataBackup gets all salesforce metadata via rest api and apache ant and push to github
GitHub / Tools
https://github.com/danieljpeter/HyperBatch by MVP DanielJPeter
https://github.com/fredrikhogstrom/salesforce-test-data-creation by fredrikhogstrom provides NodeJs code to create test data for Salesforce unit tests, great for when in an unfamiliar environment. It uses multiple API’s to get the data.
Learning Modules
This 5 part video series walks you through all aspects of data import, from preparing your import files to matching owner and parent record IDs, on both Classic and Lightning.
This 3-part video series TC: SALESFORCE DATA BACKUP AND DISASTER RECOVERY OPTIONS [SP]
Project: Import and Export with Data Management Tools +300 to use Data Loader and the Data Import Wizard to manage data in Salesforce.
-
Data Management +550 Learn how to import and export data in Salesforce.
-
SalesforceIQ to Sales Cloud Setup +1000 to optimize your new Sales Cloud org for your growing business.
-
Data Modeling +1500 to give your data structure with objects, fields, and relationships.
-
Large Data Volumes +400 Plan for and work with large data volumes within Salesforce.
-
Data Migration and Service Cloud Setup +200 to move your data from Desk.com to Service Cloud and perform your initial setup.
-
Custom Metadata Types +1400 Learn how to customize, deploy, package, and upgrade application metadata with ease.
-
Account Data Strategies +200 Learn how your sales team can use data to close deals.
-
Lightning Data Service Basics +1100 Learn about force:recordData, its benefits, and how to use it in your components.
References
-
“Comparing Data Loaders for Salesforce” (2013) by Roger Mitchell
-
Best Data loader for Salesforce” (2014) by Ben McCarthy
VIDEO: How to: Use Two Great (and Free) Data Tools - Data Loader and Excel Connector
SFDX Sandbox Data Loads from SnowForce SLC Apr 24 2019 [48m 14s] by
Randy Barton (“SFDX Baby Steps - Using sandboxes instead of scratch orgs”)
For those who have bought Mike Wheeler’s Udemy class:
More about Salesforce
This is one of a series about Salesforce
- Salesforce Ohana (about the Salesforce company, offices, mascots, emojis, and store)
- Salesforce Glossary (of acronyms)
- Salesforce Events (Conferences, local Meetups, ) to meet people face-to-face
- Salesforce Exhibitors (at Dreamforce)
- Salesforce Onboarding (Trailhead and IDEs)
- Salesforce Offerings (Clouds, Industries, Domains, GitHub, editions, pricing, features, versions)
- Salesforce Certifications (training and exams)
- Salesforce Projects, Superbadges, and Sample Apps
- Salesforce Project Plans
- Salesforce Jobs (within Salesforce, with partners, etc.)
- Salesforce Apps (in AppExchange)
- Salesforce Alexa
- Salesforce Heroku (external apps)
- Salesforce Non-Profit support
-
Salesforce NPSP (Non-Profit Success Pack) performance (with Gatling)
- Salesforce Data Management
- Salesforce Einstein
- Salesforce Selenium (test automation)