Selecting the right Data Integration tool

di-for-everyone

What is Data Integration?

Data integration is the process of retrieving data from multiple source systems and combining it in such a way that it can yield consistent, comprehensive, current and correct information for business reporting and analysis.

Data integration is challenging. The number of sources and types of data continue to grow, and the data is often autonomous and may be in a variety of formats.

Is Data Integration a necessity?

Just think about it, the data integration market is expected to grow from USD 6.44 Billion in 2017 to USD 12.24 Billion by 2022, at a Compound Annual Growth Rate (CAGR) of 13.7%.

The major factor driving this market is the high demand for tools that can combine numerous heterogeneous data sources, enabling users to get a consolidated view of data and extract valuable business insights.

The growing importance of data analytics and BI applications in taking strategic business decision has made data integration’s role crucial. From collecting data, transforming it into useful insights and delivering it to the users require effective data integration tools.

How do you find the right Data Integration tool?

It is not an easy task to go and find the best tool out there.

“The goal is to turn data into information, and information into insight.”

Once you figure out what are your needs and what projects you want to start with, you can start doing your own research by googling data integration (ETL) tools or asking around what others have been using, what they would use, etc.

Things to consider while choosing an ETL tool:

  • Price: some ETL tools are free, but there are hidden costs, and actual costs. Prices vary according to usage. The more money you spend, the lower the unit price. So most vendors have tailored prices, very few have standard prices across all usages/volumes. The key point here is to know your limits.
  • Scalability: as data sources, volumes, and other complexities increase, scaling and managing ETL process becomes increasingly difficult. ETL tools, especially cloud-based ETL tools, remove this obstacle as they scale as your needs grow.
  • Data Sources: an ability to connect to the sources that you need now and could potentially want. Think of it as an investment in your data journey.
  • Simplicity: the end users must be able to devote a little extra time to learn their way around point-and-click interfaces in an ETL tool so that they are the primary ones in charge of the tool. With cloud-based ETL tools, one tool can be used to manage the entire process, reducing extra layers of dependencies.
  • Real-time: building a real-time ETL process manually is a challenge. With ETL tools handling this for you, having real-time data at your fingertips, from sources throughout the organization, becomes a lot easier.
  • Maintenance: instead of your development team constantly fixing bugs and errors, making use of ETL tools means that maintenance is handled automatically, as patches and updates propagate seamlessly and automatically.
  • Security: the chosen ETL tool must have high-security standards and ensure that you are on the right side of compliance.

There are a broad variety of ETL tools available, each with its own advantages and disadvantages. Gaining an understanding of these differences can help you choose the best ETL tool for your needs.

Data Integration can save the day

It is always a pleasure to get feedback from happy customers – and the road to customer satisfaction in the data integration business is long. To ensure satisfaction, we work with you closely from the very beginning, when you just research your different alternatives.

A data integration project requires commitment and great communication from both parties – when it is given, investing in an integration platform can be the best decision you have ever made.

We provide you with the product and solution demos, explain core features of the platform, answer your technical team’s questions, scope the first pilot together, define how we should move forward from there and the goal is always to develop the best possible solution.

Ready to get started?

Contact us  or request a demo to get your data integration project up and running in minutes.

Etlworks Marketo Integration

etlworks-marketo-data-integration

What is Marketo?

Marketo is a cloud-lead management and marketing solution. The product range of Marketo is provided on a subscription basis and covers Lead Management, Sales Insights, Revenue Cycle Analytics and Social Marketing applications. It helps organizations automate and measure marketing engagement, tasks, and workflows, including those for email, mobile, social, and digital ads.

What is Etlworks?

Etlworks is a cloud-native integration platform helps businesses automate manual data management tasks, ensure data that are far more accurate, accelerate business workflows, and provide greater operational visibility to an organization.

After a few minutes setup, Etlworks replicates all your applications, databases, events and files into a high-performance data warehouse like Snowflake or Amazon Redshift, so that you can then use your favorite BI or analytics tools. Create reports, monitor custom dashboards, and more instantly from the cloud.

Connect Marketo to Anything

Etlworks offers connectivity to Marketo’s APIs enabling you to work with key Marketo entities including Lead, Activity, List, Opportunity, OpportunityRole as well as Custom Objects. Etlworks exposes both the SOAP and REST APIs for Marketo ensuring you can handle any integration task.

Use the Etlworks Marketo connector for data integration between Marketo and your CRM system, such as Salesforce, MS Dynamics, SugarCRM, HubSpot, and NetSuite; collaboration or survey tools; webinar platforms; data services; marketing databases; and more.

Etlworks Marketo connector free you to focus on insights, so your company will be faster and more efficient at optimizing your marketing performance and improving your campaigns’ ROI.

Etlworks partnered with CData to provide access to the Marketo API using industry standard JDBC protocol.

Let’s do it!

Connecting to Marketo

Step 1. Obtaining the OAuthClientId and OAuthClientSecret Values. To obtain the OAuthClientIdand OAuthClientSecret, navigate to the LaunchPoint option on the Admin area. Click the View Details link for the desired service. A window containing the authentication credentials is displayed.

Step 2. Obtaining the REST Endpoint URL. The RESTEndpoint can be found on your Marketo Admin area on the Integration -> Web Services option in the REST API section. Note the Identity Endpoint will not be needed.

Step 3. Enable Marketo connector for your Etlworks account. Contact support@etlworks.com to enable connector.

Step 4. Create a Marketo connection to work with data in Marketo.

Stored Procedures

Stored Procedures are available to complement the data available from the REST Data Model. Sometimes it is necessary to update data available from a view using a stored procedure because the data does not provide for direct, table-like, two-way updates. In these situations, the retrieval of the data is done using the appropriate view or table, while the update is done by calling a stored procedure. Stored procedures take a list of parameters and return back a dataset that contains the collection of tuples that constitute the response.

To call stored procure from the SQL flow or from Before/After SQL use EXEC sp_name params=value syntax. Example:

EXEC SelectEntries ObjectName = 'Account'

Extracting data from Marketo

Note: extracting data from Marketo is similar to extracting data from the relational database.

Step 1. Create a Marketo connection which will be used as a source (FROM).

Step 2. Create a destination connection, for example, a connection to the relational database, and if needed a format (format is not needed if the destination is a database or well-known API).

Step 3. Create a flow where the source is a database and the destination is a connection created in step 2, for example, relational database.

mceclip0

Step 4. Add new source-to-destination transformation.

Step 5. Select Marketo connection created in step 1 as a source connection and select the Marketo object you are extracting data from:mceclip0 (1)

Step 6. Select TO connection, format (if needed) and object (for example database table) to load data into.

mceclip3

Step 7. Click MAPPING and optionally enter Source Query (you don’t need a query if you are extracting data from the Marketo object unconditionally).

Step 8. Optionally define the per-field mapping.

salesforce-mapping (1)

Step 9. Add more transformations if needed.

Loading data in Marketo

Note: loading data in Marketo is similar to loading data into relational database.

Step 1. Create a source connection and a format (if needed).

Step 2. Create destination Marketo connection.

Step 3. Create a flow where the destination is a database.

Step 4. Add new source-to-destination transformation.

Step 5. Select FROM and TO connections and objects (also a FROM format if needed).

mceclip5

Step 6. Optionally define the per-field mapping.

Step 7. Add more transformations if needed.

Browsing data in Marketo

You must have a Marketo connection to browse objects and run SQL queries.

Use Explorer to browse data and metadata in Marketo as well as execute DML and SELECT queries against Marketo connection.

mceclip4

Ready to get started?

Contact Etlworks today to connect your Marketo instance with Etlworks and unlock the ability to read and replicate many of the objects to your data destination.

ETL/ELT all your data into Amazon Redshift DW

amazon_integration

Amazon Redshift is fast, scalable, and easy-to-use, making it a popular data warehouse solution. Redshift is straightforward to query with SQL, efficient for analytical queries and can be a simple add-on for any organization operating its tech stack on AWS.

Amazon Web Services have many benefits. Whether you choose it for the pay as you go pricing, high performance, and speed or its versatile and flexible services provided, we are here to present you the best data loading approaches that work for us.

Etlworks allows users to load your data from cloud storages and APIs, SQL and NoSQL databases, web services to Amazon Redshift data warehouse in a few simple steps. You can configure and schedule the flow using intuitive drag and drop interface and let Etlworks do the rest.

Etlworks supports not just one-time data loading operation. It can help you to integrate your data sources with Amazon Redshift and automate updating your Amazon Redshift with fresh data with no additional effort or involvement!

Today we are going to examine how to load data into Amazon Redshift.

A typical Redshift flow performs the following operations:

  • Extract data from the source.
  • Create CSV files.
  • Compress files using the gzip algorithm.
  • Copy files into Amazon S3 bucket.
  • Check to see if the destination Amazon Redshift table exists, and if it does not – creates the table using metadata from the source.
  • Execute the Amazon Redshift COPY command.
  • Clean up the remaining files.

There are some prerequisites have to be met, before you can design a flow that loads data into Amazon Redshift:

Now, you are ready to create a Redshift flow. Start by opening the Flows window, clicking the + button, and typing redshift into the search field:

redshift-flows

Continue by selecting the flow type, adding source-to-destination transformations and entering the transformation parameters:

redshift-transformation

You can select one of the following sources (FROM) for the Redshift flow:

  • API – use any appropriate string as the source (FROM) name
  • Web Service – use any appropriate string as the source (FROM) name
  • File – use the source file name or a wildcard filename as the source (FROM) name
  • Database – use the table name as the source (FROM) name
  • CDC – use the fully qualified table name as the source (FROM) name
  • Queue – use the queue topic name as the source (FROM) name

For most of the Redshift flows, the destination (TO) is going to be Amazon S3 connection. To configure the final destination, click the Connections tab and select the available Amazon Redshift connection.

redshift-connection

Amazon Redshift can load data from CSVJSON, and Avro formats but Etlwoks supports loading only from CSV so you will need to create a new CSV format and set it as a destination format. If you are loading large datasets into Amazon Redshift, consider configuring a format to split the document into smaller files. Amazon Redshift can load files in parallel, also transferring smaller files over the network can be faster.

If necessary, you can create a mapping  between the source and destination (Redshift) fields.

Mapping is not required, but please remember that if a source field name is not supported by Redshift, it will return an error and the data will not be loaded into the database. For example, if you are loading data from Google Analytics, the output (source) is going to include fields with the prefix ga: ( ga:user, ga:browser, etc. ). Unfortunately, Amazon Redshift does not support fields with a : , so the data will be rejected. If that happens, you can use mapping to rename the destination fields.

ELT for Amazon Redshift

Amazon Redshift provides affordable and nearly unlimited computing power which allows loading data to Amazon Redshift as-is, without pre-aggregation, and processing and transforming all the data quickly when executing analytics queries. Thus, the ETL (Extract-Transform-Load) approach transforms to ELT (Extract-Load-Transform). This may simplify data loading to Amazon Redshift greatly, as you don’t need to think about the necessary transformations.

Etlworks supports executing complex ELT scripts directly into Amazon Redshift which greatly improves performance and reliability of the data injection.

I hope this has been helpful. Go forth and load large amounts of data.

Why Etlworks?

If you are reading this, you have most likely already answered the question:

“why do I need a data integration tool anyway?”

The answer is probably any combination of the things below:

  • I’m tired of dealing with the complicated inputs and outputs and don’t want to write a custom code anymore.
  • I need a way to connect to all my data, regardless of its format and location.
  • I need a simple way to transform my data from one format to another.
  • I need to track changes in my transactional database and push them to my data warehouse.
  • I need to connect to the external and internal APIs with different authentication schemas, requests, and responses.
  • I want to create new APIs with just a few mouse clicks, without writing any code.
  • Generally speaking, I’d prefer not to write any code at all.
  • I don’t want to worry about backups, performance, and job monitoring. I want someone to manage my data integration tool for me.

If this sounds familiar, you are probably in the process of selecting the best tool for the task at hands. The truth is, most modern data integration tools are quite good and can significantly simplify your life.

Etlworks is so much better.

By selecting Etlworks as your data integration tool, you will be able to implement very complex data integration flows with fewer steps and faster.

The key advantages of the Etlworks:

  • It automatically parses even the most complicated JSON and XML documents (as well as other formats) and can connect to all your APIs and databases.
  • It is built for Cloud but works equally well when installed on-premise.
  • You can visualize and explore all your data, regardless of the format and location, before creating any data integration flows.
  • You probably won’t need to write any code at all, but even if you will, it will be just a few lines in the familiar programming language: SQL, JavaScript, and Python.
  • You can use SQL to extract and flatten data from nested documents.
  • Etlworks is a full-fledged enterprise service bus (ESB) so you can create data integration APIs with just a few mouse clicks.
  • Etlworks can integrate data behind the corporate firewall when working together with data integration agent. Read more in this blog post.
  • We provide world-class support for all customers.
  • Our service is so affordable that you won’t have to get a board approval in order to use it.
  • No sales call necessary! Just sign up and start using it right away.

Test drive Etlworks for 14 days free of charge.