ADBC-15 #110

Merged 7 commits on Nov 21, 2024
60 changes: 60 additions & 0 deletions 3rd_party/apache-arrow-adbc/CHANGELOG.md
@@ -699,3 +699,63 @@
- **ci**: update website_build.sh for new versioning scheme (#1972)
- **dev/release**: update C# <VersionSuffix> tag (#1973)
- **c/vendor/nanoarrow**: Fix -Wreorder warning (#1966)

## ADBC Libraries 15 (2024-11-08)

### Versions

- C/C++/GLib/Go/Python/Ruby: 1.3.0
- C#: 0.15.0
- Java: 0.15.0
- R: 0.15.0
- Rust: 0.15.0

### Feat

- **c/driver/postgresql**: Enable basic connect/query workflow for Redshift (#2219)
- **rust/drivers/datafusion**: add support for bulk ingest (#2279)
- **csharp/src/Drivers/Apache**: convert Double to Float for Apache Spark on scalar conversion (#2296)
- **go/adbc/driver/snowflake**: update to the latest 1.12.0 gosnowflake driver (#2298)
- **csharp/src/Drivers/BigQuery**: support max stream count setting when creating read session (#2289)
- **rust/drivers**: adbc driver for datafusion (#2267)
- **go/adbc/driver/snowflake**: improve GetObjects performance and semantics (#2254)
- **c**: Implement ingestion and testing for float16, string_view, and binary_view (#2234)
- **r**: Add R BigQuery driver wrapper (#2235)
- **csharp/src/Drivers/Apache/Spark**: add request_timeout_ms option to allow longer HTTP request length (#2218)
- **go/adbc/driver/snowflake**: add support for a client config file (#2197)
- **csharp/src/Client**: Additional parameter support for DbCommand (#2195)
- **csharp/src/Drivers/Apache/Spark**: add option to ignore TLS/SSL certificate exceptions (#2188)
- **csharp/src/Drivers/Apache/Spark**: Perform scalar data type conversion for Spark over HTTP (#2152)
- **csharp/src/Drivers/Apache/Spark**: Azure HDInsight Spark Documentation (#2164)
- **c/driver/postgresql**: Implement ingestion of list types for PostgreSQL (#2153)
- **csharp/src/Drivers/Apache/Spark**: poc - Support for Apache Spark over HTTP (non-Arrow) (#2018)
- **c/driver/postgresql**: add `arrow.opaque` type metadata (#2122)

### Fix

- **csharp/src/Drivers/Apache**: fix float data type handling for tests on Databricks Spark (#2283)
- **go/adbc/driver/internal/driverbase**: proper unmarshalling for ConstraintColumnNames (#2285)
- **csharp/src/Drivers/Apache**: fix to workaround concurrency issue (#2282)
- **csharp/src/Drivers/Apache**: correctly handle empty response and add Client tests (#2275)
- **csharp/src/Drivers/Apache**: remove interleaved async look-ahead code (#2273)
- **c/driver_manager**: More robust error reporting for errors that occur before AdbcDatabaseInit() (#2266)
- **rust**: implement database/connection constructors without options (#2242)
- **csharp/src/Drivers**: update System.Text.Json to version 8.0.5 because of known vulnerability (#2238)
- **csharp/src/Drivers/Apache/Spark**: correct batch handling for the HiveServer2Reader (#2215)
- **go/adbc/driver/snowflake**: call GetObjects with null catalog at catalog depth (#2194)
- **csharp/src/Drivers/Apache/Spark**: correct BatchSize implementation for base reader (#2199)
- **csharp/src/Drivers/Apache/Spark**: correct precision/scale handling with zeros in fractional portion (#2198)
- **csharp/src/Drivers/BigQuery**: Fixed GBQ driver issue when results.TableReference is null (#2165)
- **go/adbc/driver/snowflake**: fix setting database and schema context after initial connection (#2169)
- **csharp/src/Drivers/Interop/Snowflake**: add test to demonstrate DEFAULT_ROLE behavior (#2151)
- **c/driver/postgresql**: Improve error reporting for queries that error before the COPY header is sent (#2134)

### Refactor

- **c/driver/postgresql**: cleanups for result_helper signatures (#2261)
- **c/driver/postgresql**: Use GetObjectsHelper from framework to build objects (#2189)
- **csharp/src/Drivers/Apache/Spark**: use UTF8 string for data conversion, instead of .NET String (#2192)
- **c/driver/postgresql**: Use Status for error handling in BindStream (#2187)
- **c/driver/postgresql**: Use Status instead of AdbcStatusCode/AdbcError in result helper (#2178)
- **c/driver**: Use non-objects framework components in Postgres driver (#2166)
- **c/driver/postgresql**: Use copy writer in BindStream for parameter binding (#2157)
6 changes: 3 additions & 3 deletions 3rd_party/apache-arrow-adbc/CONTRIBUTING.md
@@ -31,8 +31,8 @@ https://github.com/apache/arrow-adbc/issues
Some dependencies are required to build and test the various ADBC packages.

For C/C++, you will most likely want a [Conda][conda] installation,
-with [Mambaforge][mambaforge] being the most convenient distribution.
-If you have Mambaforge installed, you can set up a development
+with [Miniforge][miniforge] being the most convenient distribution.
+If you have Miniforge installed, you can set up a development
environment as follows:

```shell
# (setup commands collapsed in the diff view)
```

@@ -52,7 +52,7 @@
CMake or other build tool appropriately. However, we primarily
develop and support Conda users.

[conda]: https://docs.conda.io/en/latest/
-[mambaforge]: https://mamba.readthedocs.io/en/latest/installation/mamba-installation.html
+[miniforge]: https://mamba.readthedocs.io/en/latest/installation/mamba-installation.html

### Running Integration Tests

Expand Down
2 changes: 1 addition & 1 deletion 3rd_party/apache-arrow-adbc/README.md
@@ -57,4 +57,4 @@ User documentation can be found at https://arrow.apache.org/adbc

## Development and Contributing

-For detailed instructions on how to build the various ADBC libraries, see CONTRIBUTING.md.
+For detailed instructions on how to build the various ADBC libraries, see [CONTRIBUTING.md](CONTRIBUTING.md).
4 changes: 0 additions & 4 deletions 3rd_party/apache-arrow-adbc/c/CMakeLists.txt
@@ -35,10 +35,6 @@ add_subdirectory(vendor/nanoarrow)
add_subdirectory(driver/common)
add_subdirectory(driver/framework)

-install(FILES "${REPOSITORY_ROOT}/c/include/adbc.h" DESTINATION include)
-install(FILES "${REPOSITORY_ROOT}/c/include/arrow-adbc/adbc.h"
-        DESTINATION include/arrow-adbc)

if(ADBC_BUILD_TESTS)
add_subdirectory(validation)
endif()
13 changes: 4 additions & 9 deletions 3rd_party/apache-arrow-adbc/c/apidoc/Doxyfile
@@ -500,7 +500,7 @@ EXTRACT_ALL = NO
# be included in the documentation.
# The default value is: NO.

-EXTRACT_PRIVATE = NO
+EXTRACT_PRIVATE = YES

# If the EXTRACT_PRIV_VIRTUAL tag is set to YES, documented private virtual
# methods of a class will be included in the documentation.
@@ -891,7 +891,7 @@ WARN_LOGFILE =
# spaces. See also FILE_PATTERNS and EXTENSION_MAPPING
# Note: If this tag is empty the current directory is searched.

-INPUT = ../../c/include/arrow-adbc/adbc.h ../../README.md ../../c/include/arrow-adbc/adbc_driver_manager.h
+INPUT = ../../c/include/arrow-adbc/adbc.h ../../README.md ../../c/include/arrow-adbc/adbc_driver_manager.h ../../c/driver/framework/

# This tag can be used to specify the character encoding of the source files
# that doxygen parses. Internally doxygen uses the UTF-8 encoding. Doxygen uses
@@ -920,12 +920,7 @@ INPUT_ENCODING = UTF-8
# comment), *.py, *.pyw, *.f90, *.f95, *.f03, *.f08, *.f18, *.f, *.for, *.vhd,
# *.vhdl, *.ucf, *.qsf and *.ice.

-FILE_PATTERNS = *.c \
-  *.cc \
-  *.cxx \
-  *.cpp \
-  *.c++ \
-  *.java \
+FILE_PATTERNS = *.java \
*.ii \
*.ixx \
*.ipp \
@@ -1007,7 +1002,7 @@ EXCLUDE_PATTERNS =
# Note that the wildcards are matched against the file with absolute path, so to
# exclude all test directories use the pattern */test/*

-EXCLUDE_SYMBOLS =
+EXCLUDE_SYMBOLS = ADBC ADBC_DRIVER_MANAGER_H

# The EXAMPLE_PATH tag can be used to specify one or more files or directories
# that contain example code fragments that are included (see the \include
@@ -21,7 +21,7 @@
# ------------------------------------------------------------
# Version definitions

-set(ADBC_VERSION "1.2.0")
+set(ADBC_VERSION "1.3.0")
string(REGEX MATCH "^[0-9]+\\.[0-9]+\\.[0-9]+" ADBC_BASE_VERSION "${ADBC_VERSION}")
string(REPLACE "." ";" _adbc_version_list "${ADBC_BASE_VERSION}")
list(GET _adbc_version_list 0 ADBC_VERSION_MAJOR)
13 changes: 9 additions & 4 deletions 3rd_party/apache-arrow-adbc/c/driver/bigquery/bigquery_test.cc
@@ -15,15 +15,18 @@
// specific language governing permissions and limitations
// under the License.

+#include <algorithm>
+#include <cstring>
+#include <random>
+#include <thread>

#include <arrow-adbc/adbc.h>
#include <gmock/gmock-matchers.h>
#include <gtest/gtest-matchers.h>
#include <gtest/gtest-param-test.h>
#include <gtest/gtest.h>
#include <nanoarrow/nanoarrow.h>
-#include <algorithm>
-#include <cstring>
-#include <random>

#include "validation/adbc_validation.h"
#include "validation/adbc_validation_util.h"

@@ -120,7 +123,9 @@ class BigQueryQuirks : public adbc_validation::DriverQuirks {
create += "` (int64s INT, strings TEXT)";
CHECK_OK(AdbcStatementSetSqlQuery(&statement.value, create.c_str(), error));
CHECK_OK(AdbcStatementExecuteQuery(&statement.value, nullptr, nullptr, error));
-sleep(5);
+// XXX: is there a better way to wait for BigQuery? (Why does 'CREATE
+// TABLE' not wait for commit?)
+std::this_thread::sleep_for(std::chrono::seconds(5));

std::string insert = "INSERT INTO `ADBC_TESTING.";
insert += name;
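The fixed five-second sleep above is flagged as a workaround in the diff's own XXX comment. One common alternative is polling a readiness probe with a deadline. The sketch below is illustrative only: `WaitUntilReady` and `is_ready` are hypothetical names, not part of the ADBC test suite, and the probe would have to be something cheap like a trivial query against the new table.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Poll a readiness probe until it succeeds or a deadline passes, rather than
// sleeping for a fixed interval. Returns true if the probe ever succeeded.
bool WaitUntilReady(const std::function<bool()>& is_ready,
                    std::chrono::milliseconds timeout,
                    std::chrono::milliseconds interval) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (std::chrono::steady_clock::now() < deadline) {
    if (is_ready()) return true;
    std::this_thread::sleep_for(interval);
  }
  return is_ready();  // one final attempt at the deadline
}
```

This returns as soon as the table is visible instead of always paying the worst-case wait, at the cost of issuing extra probe queries.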
66 changes: 2 additions & 64 deletions 3rd_party/apache-arrow-adbc/c/driver/common/utils.c
@@ -235,70 +235,8 @@ struct AdbcErrorDetail CommonErrorGetDetail(const struct AdbcError* error, int i
};
}

struct SingleBatchArrayStream {
struct ArrowSchema schema;
struct ArrowArray batch;
};
static const char* SingleBatchArrayStreamGetLastError(struct ArrowArrayStream* stream) {
(void)stream;
return NULL;
}
static int SingleBatchArrayStreamGetNext(struct ArrowArrayStream* stream,
struct ArrowArray* batch) {
if (!stream || !stream->private_data) return EINVAL;
struct SingleBatchArrayStream* impl =
(struct SingleBatchArrayStream*)stream->private_data;

memcpy(batch, &impl->batch, sizeof(*batch));
memset(&impl->batch, 0, sizeof(*batch));
return 0;
}
static int SingleBatchArrayStreamGetSchema(struct ArrowArrayStream* stream,
struct ArrowSchema* schema) {
if (!stream || !stream->private_data) return EINVAL;
struct SingleBatchArrayStream* impl =
(struct SingleBatchArrayStream*)stream->private_data;

return ArrowSchemaDeepCopy(&impl->schema, schema);
}
static void SingleBatchArrayStreamRelease(struct ArrowArrayStream* stream) {
if (!stream || !stream->private_data) return;
struct SingleBatchArrayStream* impl =
(struct SingleBatchArrayStream*)stream->private_data;
impl->schema.release(&impl->schema);
if (impl->batch.release) impl->batch.release(&impl->batch);
free(impl);

memset(stream, 0, sizeof(*stream));
}

AdbcStatusCode BatchToArrayStream(struct ArrowArray* values, struct ArrowSchema* schema,
struct ArrowArrayStream* stream,
struct AdbcError* error) {
if (!values->release) {
SetError(error, "ArrowArray is not initialized");
return ADBC_STATUS_INTERNAL;
} else if (!schema->release) {
SetError(error, "ArrowSchema is not initialized");
return ADBC_STATUS_INTERNAL;
} else if (stream->release) {
SetError(error, "ArrowArrayStream is already initialized");
return ADBC_STATUS_INTERNAL;
}

struct SingleBatchArrayStream* impl =
(struct SingleBatchArrayStream*)malloc(sizeof(*impl));
memcpy(&impl->schema, schema, sizeof(*schema));
memcpy(&impl->batch, values, sizeof(*values));
memset(schema, 0, sizeof(*schema));
memset(values, 0, sizeof(*values));
stream->private_data = impl;
stream->get_last_error = SingleBatchArrayStreamGetLastError;
stream->get_next = SingleBatchArrayStreamGetNext;
stream->get_schema = SingleBatchArrayStreamGetSchema;
stream->release = SingleBatchArrayStreamRelease;

return ADBC_STATUS_OK;
bool IsCommonError(const struct AdbcError* error) {
return error->release == ReleaseErrorWithDetails || error->release == ReleaseError;
}

int StringBuilderInit(struct StringBuilder* builder, size_t initial_size) {
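The `IsCommonError` helper added in this hunk identifies ownership by comparing the error's `release` callback against the driver-common release functions: the function pointer itself serves as the tag. A minimal standalone sketch of that idiom, using hypothetical names rather than the actual ADBC structs:

```cpp
// An object carries a pointer to the function that frees it; comparing that
// pointer against known cleanup functions reveals which allocator produced
// the object, with no extra tag field needed.
struct Error {
  void (*release)(Error*);
};

void ReleaseError(Error* e) { e->release = nullptr; }
void ReleaseErrorWithDetails(Error* e) { e->release = nullptr; }

// Mirrors the IsCommonError idiom: "ours" iff released by one of our functions.
bool IsCommonError(const Error* e) {
  return e->release == ReleaseErrorWithDetails || e->release == ReleaseError;
}
```

The design choice matters for C ABIs like ADBC's: a foreign driver may hand back an `AdbcError` it allocated itself, and the common code must not reinterpret that object's `private_data` unless the release pointer proves the common code created it.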
6 changes: 1 addition & 5 deletions 3rd_party/apache-arrow-adbc/c/driver/common/utils.h
@@ -53,6 +53,7 @@ void AppendErrorDetail(struct AdbcError* error, const char* key, const uint8_t*

int CommonErrorGetDetailCount(const struct AdbcError* error);
struct AdbcErrorDetail CommonErrorGetDetail(const struct AdbcError* error, int index);
+bool IsCommonError(const struct AdbcError* error);

struct StringBuilder {
char* buffer;
@@ -68,11 +69,6 @@ void StringBuilderReset(struct StringBuilder* builder);

#undef ADBC_CHECK_PRINTF_ATTRIBUTE

-/// Wrap a single batch as a stream.
-AdbcStatusCode BatchToArrayStream(struct ArrowArray* values, struct ArrowSchema* schema,
-                                  struct ArrowArrayStream* stream,
-                                  struct AdbcError* error);

/// Check an NanoArrow status code.
#define CHECK_NA(CODE, EXPR, ERROR) \
do { \
@@ -121,6 +121,7 @@ class SqliteFlightSqlQuirks : public adbc_validation::DriverQuirks {
bool supports_get_objects() const override { return true; }
bool supports_partitioned_data() const override { return true; }
bool supports_dynamic_parameter_binding() const override { return true; }
+std::string catalog() const override { return "main"; }
};

class SqliteFlightSqlTest : public ::testing::Test, public adbc_validation::DatabaseTest {
@@ -17,7 +17,7 @@

include(FetchContent)

-add_library(adbc_driver_framework STATIC catalog.cc objects.cc)
+add_library(adbc_driver_framework STATIC objects.cc utility.cc)
adbc_configure_target(adbc_driver_framework)
set_target_properties(adbc_driver_framework PROPERTIES POSITION_INDEPENDENT_CODE ON)
target_include_directories(adbc_driver_framework
11 changes: 11 additions & 0 deletions 3rd_party/apache-arrow-adbc/c/driver/framework/base_driver.h
@@ -455,11 +455,22 @@ class Driver {
}

auto error_obj = reinterpret_cast<Status*>(error->private_data);
if (!error_obj) {
return 0;
}
return error_obj->CDetailCount();
}

static AdbcErrorDetail CErrorGetDetail(const AdbcError* error, int index) {
if (error->vendor_code != ADBC_ERROR_VENDOR_CODE_PRIVATE_DATA) {
return {nullptr, nullptr, 0};
}

auto error_obj = reinterpret_cast<Status*>(error->private_data);
if (!error_obj) {
return {nullptr, nullptr, 0};
}

return error_obj->CDetail(index);
}
