Support for Direct Path Loading #193

Open
ceyhunkerti opened this issue Jan 24, 2025 · 5 comments

Comments

@ceyhunkerti

Hello, are there any plans to support direct path loading?

@cjbj
Member

cjbj commented Jan 25, 2025

@ceyhunkerti nothing in the immediate plans. In some discussion long ago there was a thought that it has some limited, targeted use cases, but we would need to evaluate this.

Can you provide info about what you want to use it for (data sizes & types, table types & indexes, etc.)?

@cjbj cjbj added enhancement and removed question labels Jan 25, 2025
@ceyhunkerti
Author

ceyhunkerti commented Jan 25, 2025

@cjbj yes, I searched the issues beforehand and saw some discussion dating back several years. The same question was also asked in python/cx_Oracle long ago.

Here are my thoughts and what I need it for, if you'll bear with me:

  • I am trying to create a data transfer utility in Zig, using odpi-c for the Oracle connection.
  • It will support different databases; I just prototyped the Oracle side first.
  • In my prior experience, sqlldr using direct path loading is the fastest way to load data in most cases (ETL loads).
  • So I also want to let users opt in to direct path in my run configuration.
  • I also tried switching to ocilib since it has DPL support. Although it started smoothly, I ran into some trouble with it later on. The owner of the ocilib repo also addresses some of these points here, in a reply to you.
  • Lastly, I've been using data transfer tools and methods for years. Now that Python is very popular, tools in the Singer family (there are many with the same logic) use cx_Oracle, and hence odpi-c, so they don't support DPL.

Thanks for reading so far :)

Now I have two options:

  • Fall back to OCI, which needs a huge amount of boilerplate.
  • Skip the DPL support and move on.

I also have a third option in mind, but so far I don't know how to do it:

  • Use ODPI and implement the DPL part in Zig,
    but I haven't yet seen how I can access the OCI handles (env, conn, etc.) from the dpi structs.

And maybe

  • Patch odpi-c in my repo to include DPL

@tgulacsi

I'd be interested in Direct Path Load if it's faster than OCIExecuteMany.
(csvload, import from Parquet)

@cjbj
Member

cjbj commented Jan 27, 2025

What sort of data size are you (all) using?

@ceyhunkerti
Author

For me, I don't need it for a specific data size range. It'll be an option in my program for large data sizes, let's say TB scale.
