Error using Spark Simba driver

Hey guys, new user here, trying out Siren to see if it will work out for our organization.

I’m trying to use a JDBC connection to our Spark cluster using the Simba 4.1 driver.

Here’s how far I get:

  1. The “Test Connection” fails with a “feature not supported by driver” error (I’ll include the error from the logs below); HOWEVER,
  2. When I go to create a virtual index, it is able to list all tables and fields. I can create the virtual index.
  3. After building the data model, it won’t pull any data, such as in the “discover” tab. It gives me the same “feature not supported by driver” error.

Here is the error from the siren-distribution.log file:

[2020-10-05T19:48:56,113][INFO ][o.e.e.NodeEnvironment ] [siren-node] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [1.9tb], net total_space [1.9tb], types [rootfs]
[2020-10-05T19:48:56,115][INFO ][o.e.e.NodeEnvironment ] [siren-node] heap size [3.9gb], compressed ordinary object pointers [true]
[2020-10-05T19:48:56,287][INFO ][o.e.n.Node ] [siren-node] node name [siren-node], node ID [HT0fs8VLQ0KfVC4j1NKU4g], cluster name [siren-distribution]
[2020-10-05T19:48:56,288][INFO ][o.e.n.Node ] [siren-node] version[7.6.2], pid[7600], build[oss/tar/ef48eb35cf30adf4db14086e8aabd07ef6fb113f/2020-03-26T06:34:37.794943Z], OS[Linux/3.10.0-1062.12.1.el7.x86_64/amd64], JVM[AdoptOpenJDK/OpenJDK 64-Bit Server VM/13.0.2/13.0.2+8]
[2020-10-05T19:48:56,289][INFO ][o.e.n.Node ] [siren-node] JVM home [/opt/elasticsearch/jdk]
[2020-10-05T19:48:56,289][INFO ][o.e.n.Node ] [siren-node] JVM arguments [-Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dio.netty.allocator.numDirectArenas=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.locale.providers=COMPAT, -Xms10g, -Xmx10g, -Des.insecure.allow.root=true, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.io.tmpdir=/tmp/elasticsearch-14186690721681472155, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Xms4g, -Xmx4g, -Des.insecure.allow.root=true, -XX:MaxDirectMemorySize=2147483648, -Des.path.home=/opt/elasticsearch, -Des.path.conf=/opt/elasticsearch/config, -Des.distribution.flavor=oss, -Des.distribution.type=tar, -Des.bundled_jdk=true]
[2020-10-05T19:48:57,873][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [aggs-matrix-stats]
[2020-10-05T19:48:57,874][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [analysis-common]
[2020-10-05T19:48:57,874][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [ingest-common]
[2020-10-05T19:48:57,874][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [ingest-geoip]
[2020-10-05T19:48:57,875][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [ingest-user-agent]
[2020-10-05T19:48:57,875][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [lang-expression]
[2020-10-05T19:48:57,875][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [lang-mustache]
[2020-10-05T19:48:57,876][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [lang-painless]
[2020-10-05T19:48:57,876][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [mapper-extras]
[2020-10-05T19:48:57,876][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [parent-join]
[2020-10-05T19:48:57,877][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [percolator]
[2020-10-05T19:48:57,877][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [rank-eval]
[2020-10-05T19:48:57,877][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [reindex]
[2020-10-05T19:48:57,878][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [repository-url]
[2020-10-05T19:48:57,878][INFO ][o.e.p.PluginsService ] [siren-node] loaded module [transport-netty4]
[2020-10-05T19:48:57,878][INFO ][o.e.p.PluginsService ] [siren-node] loaded plugin [mapper-annotated-text]
[2020-10-05T19:48:57,879][INFO ][o.e.p.PluginsService ] [siren-node] loaded plugin [siren-federate]
[2020-10-05T19:48:57,879][INFO ][o.e.p.PluginsService ] [siren-node] loaded plugin [siren-nlp]
[2020-10-05T19:49:05,669][INFO ][o.e.d.DiscoveryModule ] [siren-node] using discovery type [zen] and seed hosts providers [settings]
[2020-10-05T19:49:07,373][INFO ][o.e.n.Node ] [siren-node] initialized
[2020-10-05T19:49:07,373][INFO ][o.e.n.Node ] [siren-node] starting …
[2020-10-05T19:49:07,374][INFO ][i.s.f.c.i.b.d ] [siren-node] Buffer allocator service starting with Unsafe access: true
[2020-10-05T19:49:07,375][INFO ][i.s.f.c.i.b.d ] [siren-node] Buffer allocator service starting with directMemoryLimit=2147483648
[2020-10-05T19:49:07,415][INFO ][i.s.f.c.i.b.d ] [siren-node] Buffer allocator service starting with defaultNumDirectArenas=4
[2020-10-05T19:49:07,418][INFO ][i.s.f.c.i.b.d ] [siren-node] Instantiating root allocator with limit=1431655765
[2020-10-05T19:49:07,469][INFO ][i.s.f.c.p.j ] [siren-node] Planner service started
[2020-10-05T19:49:07,470][INFO ][i.s.f.c.i.w ] [siren-node] Starting connector query service
[2020-10-05T19:49:07,546][INFO ][i.s.f.c.h.f ] [siren-node] Starting connector jobs service
[2020-10-05T19:49:07,546][INFO ][i.s.f.c.g.h ] [siren-node] Starting virtual index service
[2020-10-05T19:49:07,547][INFO ][i.s.f.c.k.c ] [siren-node] Starting scheduler service FederateScheduler_siren-node.
[2020-10-05T19:49:07,547][INFO ][i.s.f.c.k.c ] [siren-node] Initializing the scheduler.
[2020-10-05T19:49:07,611][INFO ][o.q.c.SchedulerSignalerImpl] [siren-node] Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
[2020-10-05T19:49:07,612][INFO ][o.q.c.QuartzScheduler ] [siren-node] Quartz Scheduler v.2.2.1 created.
[2020-10-05T19:49:07,621][INFO ][o.q.s.RAMJobStore ] [siren-node] RAMJobStore initialized.
[2020-10-05T19:49:07,622][INFO ][o.q.c.QuartzScheduler ] [siren-node] Scheduler meta-data: Quartz Scheduler (v2.2.1) ‘FederateScheduler_siren-node’ with instanceId ‘FederateScheduler_siren-node’
Scheduler class: ‘org.quartz.core.QuartzScheduler’ - running locally.
NOT STARTED.
Currently in standby mode.
Number of jobs executed: 0
Using thread pool ‘org.quartz.simpl.SimpleThreadPool’ - with 4 threads.
Using job-store ‘org.quartz.simpl.RAMJobStore’ - which does not support persistence. and is not clustered.

[2020-10-05T19:49:07,623][INFO ][o.q.i.DirectSchedulerFactory] [siren-node] Quartz scheduler 'FederateScheduler_siren-node
[2020-10-05T19:49:07,623][INFO ][o.q.i.DirectSchedulerFactory] [siren-node] Quartz scheduler version: 2.2.1
[2020-10-05T19:49:07,624][INFO ][o.q.c.QuartzScheduler ] [siren-node] JobFactory set to: io.siren.federate.connector.k.a@4d2f8ee7
[2020-10-05T19:49:07,624][INFO ][o.q.c.QuartzScheduler ] [siren-node] Scheduler FederateScheduler_siren-node_$_FederateScheduler_siren-node started.
[2020-10-05T19:49:07,862][INFO ][o.e.t.TransportService ] [siren-node] publish_address {127.0.0.1:9330}, bound_addresses {127.0.0.1:9330}
[2020-10-05T19:49:08,303][INFO ][o.e.c.c.Coordinator ] [siren-node] cluster UUID [KoV1UGjaSf62XLTDGwZBvg]
[2020-10-05T19:49:08,547][INFO ][o.e.c.s.MasterService ] [siren-node] elected-as-master ([1] nodes joined)[{siren-node}{HT0fs8VLQ0KfVC4j1NKU4g}{QsCpHaLVR9SM51j90xM5Rw}{127.0.0.1}{127.0.0.1:9330}{dim}{connector.jdbc=true} elect leader, BECOME_MASTER_TASK, FINISH_ELECTION], term: 5, version: 39, delta: master node changed {previous , current [{siren-node}{HT0fs8VLQ0KfVC4j1NKU4g}{QsCpHaLVR9SM51j90xM5Rw}{127.0.0.1}{127.0.0.1:9330}{dim}{connector.jdbc=true}]}
[2020-10-05T19:49:08,700][INFO ][o.e.c.s.ClusterApplierService] [siren-node] master node changed {previous , current [{siren-node}{HT0fs8VLQ0KfVC4j1NKU4g}{QsCpHaLVR9SM51j90xM5Rw}{127.0.0.1}{127.0.0.1:9330}{dim}{connector.jdbc=true}]}, term: 5, version: 39, reason: Publication{term=5, version=39}
[2020-10-05T19:49:08,925][INFO ][o.e.h.AbstractHttpServerTransport] [siren-node] publish_address {127.0.0.1:9220}, bound_addresses {127.0.0.1:9220}
[2020-10-05T19:49:08,926][INFO ][o.e.n.Node ] [siren-node] started
[2020-10-05T19:49:09,041][INFO ][o.e.g.GatewayService ] [siren-node] recovered [4] indices into cluster_state
[2020-10-05T19:49:09,912][INFO ][o.e.c.r.a.AllocationService] [siren-node] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[ca_raw_data_ssodb_account][0], [.siren-federate-datasources][0]]]).
[2020-10-05T19:49:10,085][INFO ][i.s.f.c.g.h ] [siren-node] Checking for missing concrete indices.
[2020-10-05T19:49:10,086][INFO ][i.s.f.c.g.h ] [siren-node] Updating cluster medatata with virtual indices information.
[2020-10-05T19:49:10,134][INFO ][i.s.f.c.g.h ] [siren-node] Updated cluster metadata with connector indices.
[2020-10-05T19:49:10,179][INFO ][o.e.c.m.MetaDataIndexTemplateService] [siren-node] adding template [siren-nlp-template] for index patterns [*]
[2020-10-05T19:50:09,923][INFO ][c.z.h.HikariDataSource ] [siren-node] DSE_Dev-pool - Starting…
[2020-10-05T19:50:09,940][WARN ][c.z.h.u.DriverDataSource ] [siren-node] Registered driver with driverClassName=com.simba.spark.jdbc41.Driver was not found, trying direct instantiation.
[2020-10-05T19:50:10,437][INFO ][c.z.h.p.PoolBase ] [siren-node] DSE_Dev-pool - Driver does not support get/set network timeout for connections. ([Simba]JDBC Driver does not support this optional feature.)
[2020-10-05T19:50:10,442][INFO ][c.z.h.HikariDataSource ] [siren-node] DSE_Dev-pool - Start completed.

The line that seems relevant is: [2020-10-05T19:50:10,437][INFO ][c.z.h.p.PoolBase ] [siren-node] DSE_Dev-pool - Driver does not support get/set network timeout for connections. ([Simba]JDBC Driver does not support this optional feature.)
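For anyone digging through a similar log, here is a minimal sketch of how the driver-related entries could be pulled out of siren-distribution.log. It assumes every entry follows the `[timestamp][LEVEL][logger] [node] message` shape seen above; the names `parse_line` and `driver_lines` are made up for this illustration and are not part of Siren or Elasticsearch.

```python
import re

# Matches entries shaped like: [timestamp][LEVEL][logger] [node] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]"
    r"\[(?P<level>[A-Z]+)\s*\]"
    r"\[(?P<logger>[^\]]+?)\s*\]\s*"
    r"\[(?P<node>[^\]]+)\]\s*"
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log entry as a dict, or None for continuation lines."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def driver_lines(lines, keywords=("driver",)):
    """Keep WARN/ERROR entries plus any entry whose message mentions a keyword."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # not a bracketed entry, e.g. the multi-line Quartz banner
        message = entry["msg"].lower()
        if entry["level"] in ("WARN", "ERROR") or any(k in message for k in keywords):
            hits.append(entry)
    return hits


# Four entries copied from the log above.
SAMPLE = [
    "[2020-10-05T19:50:09,923][INFO ][c.z.h.HikariDataSource ] [siren-node] DSE_Dev-pool - Starting…",
    "[2020-10-05T19:50:09,940][WARN ][c.z.h.u.DriverDataSource ] [siren-node] Registered driver with driverClassName=com.simba.spark.jdbc41.Driver was not found, trying direct instantiation.",
    "[2020-10-05T19:50:10,437][INFO ][c.z.h.p.PoolBase ] [siren-node] DSE_Dev-pool - Driver does not support get/set network timeout for connections. ([Simba]JDBC Driver does not support this optional feature.)",
    "[2020-10-05T19:50:10,442][INFO ][c.z.h.HikariDataSource ] [siren-node] DSE_Dev-pool - Start completed.",
]

for entry in driver_lines(SAMPLE):
    print(entry["ts"], entry["level"], entry["msg"])
```

To run it against the real file, the same function can be fed the open file object instead of `SAMPLE`, since it only needs an iterable of lines.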

In the UI, I get this error message all the time:

Error: Request to Elasticsearch failed: {“error”:{“root_cause”:[{“type”:“p”,“reason”:“SQLFeatureNotSupportedException: (10220) Driver does not support this optional feature.”}],“type”:“p”,“reason”:“SQLFeatureNotSupportedException: (10220) Driver does not support this optional feature.”,“caused_by”:{“type”:“s_q_l_feature_not_supported_exception”,“reason”:“(10220) Driver does not support this optional feature.”}},“status”:500}
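The useful part of that message is buried in the JSON, so here is a rough sketch of how the underlying cause could be extracted from it. It assumes the text after the "Error: Request to Elasticsearch failed: " prefix is valid JSON once any typographic quotes are mapped back to plain ones; `summarize_error` is a made-up helper name, not a Siren API.

```python
import json

# Map typographic quotes (as a forum or UI may render them) back to ASCII quotes.
QUOTE_FIXES = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})

PREFIX = "Error: Request to Elasticsearch failed: "


def summarize_error(text):
    """Pull the status and the innermost cause out of the UI error message."""
    body = text.strip().translate(QUOTE_FIXES)
    if body.startswith(PREFIX):
        body = body[len(PREFIX):]
    payload = json.loads(body)
    error = payload["error"]
    cause = error.get("caused_by", {})
    return {
        "status": payload.get("status"),
        "reason": error.get("reason"),
        "caused_by_type": cause.get("type"),
        "caused_by_reason": cause.get("reason"),
    }


# The message from the UI, written here with plain quotes.
SAMPLE_ERROR = (
    'Error: Request to Elasticsearch failed: {"error":{"root_cause":[{"type":"p",'
    '"reason":"SQLFeatureNotSupportedException: (10220) Driver does not support this optional feature."}],'
    '"type":"p","reason":"SQLFeatureNotSupportedException: (10220) Driver does not support this optional feature.",'
    '"caused_by":{"type":"s_q_l_feature_not_supported_exception",'
    '"reason":"(10220) Driver does not support this optional feature."}},"status":500}'
)

for key, value in summarize_error(SAMPLE_ERROR).items():
    print(f"{key}: {value}")
```

The `caused_by` entry is the part worth quoting in a bug report, since it names the exception the driver raised rather than the wrapper around it.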

Again, it is able to enumerate all tables and fields, so the connection IS working.

Hi Timothy,

Thanks for reaching out to us. Can you please make sure you have all these .jar files present inside the jdbc-plugins directory:

[image: list of the required .jar files]

For more details, please check the Siren docs.

In addition, copy your license file to the jdbc-drivers plugin directory.

You can use this site to download the .jar files.

Regards
Manu

Hi Manu,

When I download the drivers using the link you provided, it only contains the SparkJDBC41.jar file. None of the other files you listed are included in the download. Is there a previous version of the Simba drivers that Siren supports?

I did make sure to copy the license file to the jdbc-drivers directory as well.

Tim

Hi Tim,

We were able to replicate the issue and confirmed it is a bug; we are working on fixing it. We will soon add Spark Simba JDBC driver support to the official docs, with the steps to configure it.

Thanks
Manu

Awesome! Thanks Manu! Looking forward to being able to try it out. I am hoping this is going to work for us.

Do you know about how long it will take? I’d like to showcase Siren to folks in our organization, but we have certain deadlines for our project.

Hi Tim,

We are working on fixing it and will update you as soon as the fix is available.

Regards
Manu

Hi Tim,

We have fixed the issue in the latest Federate version. It is available to download from here.

Regards
Manu