I’m using the Trino Hive connector to query data in my Kerberized Cloudera CDH 6.x environment.
I have a Trino coordinator and a Trino worker installed and configured on remote servers (not on the same servers as CDH).
I configured the “hive.properties” catalog and tested the connection using the Trino CLI.
I am able to fetch metadata with commands such as “show schemas from hive” and “show tables from hive.default”, but when I try to retrieve data with a query such as “select * from hive.default.test” I get the following error:
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: ….
I executed “hdfs fsck /” from my HDFS node to check whether the filesystem is healthy, and it is: there are no missing or corrupt blocks.
I also tried reading the file that the Trino connector is trying to access, and that was successful: I can easily view the data using “hdfs dfs -cat ”.
Also, since the HDFS NameNode is highly available, I added the configuration resources path (core-site.xml and hdfs-site.xml) to the hive catalog. Note that I downloaded these files from the node running HDFS on CDH.
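For reference, my hive.properties looks roughly like this (the hostnames, principals, and file paths below are placeholders, not my actual values):

```properties
connector.name=hive
hive.metastore.uri=thrift://metastore.example.com:9083

# Kerberos authentication to the Hive metastore
hive.metastore.authentication.type=KERBEROS
hive.metastore.service.principal=hive/_HOST@EXAMPLE.COM
hive.metastore.client.principal=trino@EXAMPLE.COM
hive.metastore.client.keytab=/etc/trino/trino.keytab

# Kerberos authentication to HDFS
hive.hdfs.authentication.type=KERBEROS
hive.hdfs.trino.principal=trino@EXAMPLE.COM
hive.hdfs.trino.keytab=/etc/trino/trino.keytab

# Hadoop client configs downloaded from the CDH cluster (needed for NameNode HA)
hive.config.resources=/etc/trino/core-site.xml,/etc/trino/hdfs-site.xml
```

The same files are deployed on both the coordinator and the worker.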
I researched the error above but couldn’t resolve it.
Does anyone have any idea what might be causing this?