User:Janst/sandbox

From Wikipedia, the free encyclopedia

EVL (Extract-Validate-Load) is an Extract-Transform-Load (ETL) tool. It focuses on high-performance processing and fast development, so it has no graphical user interface (GUI).

EVL runs on Linux and should also work on other Unix-like operating systems. It does not run on Windows and, according to its authors, never will.

Description

As an ETL tool, EVL can:

  • extract data from source files, DBMS, Hadoop, Kafka or simply any Linux command output;
  • cleanse, validate, transform and historize (SCD2) the data;
  • handle file registration, i.e. store information about the files being processed;
  • provide data lineage information at the table/file level;
  • load into files, DBMS, Hadoop, Kafka or simply into any Linux command input.
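The extract, validate and load steps listed above can be illustrated with a minimal, generic sketch. It is written in plain Python and is not EVL syntax; the function names (`extract`, `validate`, `load`, `run`) and the two-column `id,name` record layout are assumptions made only for this illustration.

```python
import csv


def extract(source):
    """Yield one dict per CSV row read from a text stream."""
    yield from csv.DictReader(source)


def validate(rows):
    """Split rows into (valid, rejected).

    Assumed rule for this sketch: `id` must be a decimal integer
    and `name` must be non-empty after trimming whitespace.
    """
    valid, rejected = [], []
    for row in rows:
        raw_id = (row.get("id") or "").strip()
        name = (row.get("name") or "").strip()
        if raw_id.isdecimal() and name:
            valid.append({"id": int(raw_id), "name": name})
        else:
            rejected.append(row)
    return valid, rejected


def load(rows, sink):
    """Write the validated rows as CSV to a text stream."""
    writer = csv.DictWriter(sink, fieldnames=["id", "name"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


def run(source, sink):
    """Run extract -> validate -> load; return (valid_count, rejected_count)."""
    valid, rejected = validate(extract(source))
    load(valid, sink)
    return len(valid), len(rejected)
```

Because `source` and `sink` are ordinary text streams, the same functions could read from a file or from the standard output of a command, and write to a file or to the standard input of a command.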

EVL philosophy

As EVL is developed and runs mostly on Linux, it also follows the Unix philosophy. As stated in the documentation[1], the philosophy is:

  • Component-oriented with a clear focus - “Do One Thing and Do It Well”.
  • Do not let simple things become complex.
  • Keep a good balance of robustness and functionality.
  • Template- and variable-oriented - a high level of abstraction.

EVL Job Manager

ETL processing can be split into three main logical parts:

  1. Job Scheduler, which simply fires a command at a given time, as Cron does.
  2. Job Manager, which manages job dependencies and relations, waits for a file to be delivered, etc.
  3. ETL tool itself.

Many schedulers also try to act as job managers. Since Cron is good enough for launching shell scripts at a given date and time, the EVL Tool includes a lean job manager of its own, named EVL Job Manager[2].
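The two job-manager duties named above, ordering jobs by their dependencies and waiting for a file to be delivered, can be sketched generically as follows. This is plain Python and does not show how EVL Job Manager is implemented or configured; `wait_for_file`, `run_jobs` and their parameters are hypothetical names chosen for this illustration.

```python
import time
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from pathlib import Path


def wait_for_file(path, timeout=5.0, poll=0.1):
    """Return True once `path` exists, or False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    while True:
        if Path(path).exists():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def run_jobs(jobs, depends_on):
    """Run every callable in `jobs` after the jobs it depends on.

    `jobs` maps a job name to a callable; `depends_on` maps a job name
    to the set of job names that must finish before it.
    Returns the order in which the jobs were run.
    """
    graph = {name: depends_on.get(name, ()) for name in jobs}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        jobs[name]()
    return order
```

In this sketch a scheduler such as Cron would only start the script at a given time, while the dependency ordering and file waiting stay inside the job manager.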

References


Category:Data warehousing products Category:Extract, transform, load tools