The flag -e tells pip to automatically apply local code changes. This approach is quite favorable since every pipeline can be put in an isolated package. Prefect isolates the pipeline code by using Deployments. So both tools make it easy for us to do local development. Anyway, let's start implementing our pipeline in Dagster!

Resources

The first concept we want to dig into is resources. Contrary to Prefect and Airflow, Dagster follows an asset-centric paradigm. Thus, Dagster focuses on files, tables, machine learning models, and so forth. This shift in paradigm is very interesting: when using data orchestration tools, we very often deal with technicalities, but actually we should focus more on our assets. We will see at a later point how Dagster handles these technicalities.

First of all, we need to define two resources: a path to the base directory where we will store our assets, and a Postgres API resource. Thus, we create a resource.py file in the franchise_blog directory and insert the following code:

Well, of course, we use the asset decorator instead of the task decorator. The business logic also looks very similar – but wait, where is the logic that stores our raw data into a CSV file? We will explore this in a minute. Our assets focus only on the business logic and don't care where our data is going or what happens to our data afterward. It is much cleaner this way – but as we will see, we have to pay a price, again!

When we take a closer look at our asset decorator, we can see that we pass 3 arguments to it:

required_resource_keys: A set of resource references that are required by the op.