Known issues and limitations
The following limitations apply to Cloudera Streaming Analytics 1.3.
SQL Stream Builder
- CSA-1180: SSB generates directory in HDFS
- When using SQL Stream Builder, a log directory is automatically created in HDFS. This directory can accumulate a large amount of data, which can lead to performance issues.
- CSA-1037: Schema Detection fails with invalid key
- Automatic schema detection can fail when a key is defined with an invalid name, for example, when the key name contains special characters.
- CSA-1023: SQL Stream jobs with large schemas fail when using MySQL
- SQL Stream jobs that have large schemas fail when you configure SQL Stream Builder with a MySQL database. The following error message appears when you run into this issue:
_mysql_connector.MySQLInterfaceError: Data too long for column 'sb_job_data' at row 114:14
The MySQL ‘text’ data columns are limited to 64 KB in length. Make sure that the schema does not exceed this value or, as a workaround, change the ‘text’ data type to ‘longtext’, which has a 4 GB limit.
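The size limit described above can be checked before a job is submitted. The following is a minimal sketch, not part of SQL Stream Builder: the class and method names are hypothetical, and it assumes the schema is stored as UTF-8 text, which the source does not state. The 65,535-byte figure is the documented maximum for a MySQL TEXT column.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: checks whether a serialized schema fits into a MySQL TEXT column. */
final class SchemaSizeCheck {
    // A MySQL TEXT column holds at most 65,535 bytes (64 KB minus one byte).
    static final int MYSQL_TEXT_MAX_BYTES = 65_535;

    /**
     * Returns true if the schema, encoded as UTF-8 (an assumption about
     * how the data is stored), fits into a TEXT column.
     */
    static boolean fitsInTextColumn(String schema) {
        return schema.getBytes(StandardCharsets.UTF_8).length <= MYSQL_TEXT_MAX_BYTES;
    }
}
```

Calling SchemaSizeCheck.fitsInTextColumn(schemaText) with the schema as a string returns false for schemas that would trigger the "Data too long" error under these assumptions. Note that the check counts bytes, not characters, so schemas with multi-byte characters reach the limit sooner.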
Flink
- The following SQL API features are in preview:
- Match recognize
- Top-N
- Stream-Table join (without rowtime input)
DataStream conversion limitations
- Converting between Tables and POJO DataStreams is currently not supported in CSA.
- Object arrays are not supported for Tuple conversion.
- The java.time class conversions for Tuple DataStreams are only supported by using explicit TypeInformation: LegacyInstantTypeInfo, LocalTimeTypeInfo.getInfoFor(LocalDate/LocalDateTime/LocalTime.class).
- Only java.sql.Timestamp is supported for rowtime conversion; java.time.LocalDateTime is not supported.
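Because only java.sql.Timestamp is supported for rowtime conversion, one possible approach is to convert LocalDateTime values before they enter the stream. The sketch below uses only the standard JDK method Timestamp.valueOf(LocalDateTime); the class and method names are hypothetical, and where this conversion belongs in a pipeline (for example, in a map step ahead of the Table conversion) is an assumption.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;

/** Hypothetical helper: turns a LocalDateTime event time into a java.sql.Timestamp. */
final class RowtimeConversion {
    /**
     * Converts the given local date-time to a Timestamp with the same
     * year, month, day, hour, minute, second, and nanosecond values.
     */
    static Timestamp toSqlTimestamp(LocalDateTime eventTime) {
        return Timestamp.valueOf(eventTime);
    }
}
```

Timestamp.valueOf interprets the value in the JVM's default time zone, so jobs that run on machines with different time zone settings may need an explicit zone-aware conversion instead.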
Kudu catalog limitations
- CREATE TABLE
- Primary keys can only be set by the kudu.primary-key-columns property. Using the PRIMARY KEY constraint is not yet possible.
- Range partitioning is not supported.
- When getting a table through the catalog, NOT NULL and PRIMARY KEY constraints are ignored. All columns are described as being nullable, and not being primary keys.
- Kudu tables cannot be altered through the catalog other than simply renaming them.
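To illustrate the CREATE TABLE limitation, the sketch below builds a statement that sets the primary key through the kudu.primary-key-columns property instead of a PRIMARY KEY constraint. The class name, the column definitions, and the table name passed in are hypothetical, and a real table may need further properties (for example, for partitioning) that the source does not list.

```java
/** Hypothetical helper: builds a Kudu CREATE TABLE statement for the Flink SQL API. */
final class KuduCreateTableExample {
    /**
     * Returns a CREATE TABLE statement whose primary key is declared only
     * through the kudu.primary-key-columns table property.
     * The columns (id, name) are illustrative placeholders.
     */
    static String createTableDdl(String table, String primaryKeyColumn) {
        return "CREATE TABLE " + table + " (\n"
             + "  id INT,\n"
             + "  name STRING\n"
             + ") WITH (\n"
             + "  'kudu.primary-key-columns' = '" + primaryKeyColumn + "'\n"
             + ")";
    }
}
```

The resulting string could then be submitted through the table environment while the Kudu catalog is the current catalog; that step is omitted here because it depends on the job setup.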
Schema Registry catalog limitations
- Currently, the Schema Registry catalog / format only supports reading messages with the latest enabled schema for any given Kafka topic at the time when the SQL query was compiled.
- No time-column and watermark support for Registry tables.
- No CREATE TABLE support. Schemas have to be registered directly in the SchemaRegistry to be accessible through the catalog.
- The catalog is read-only. It does not support table deletions or modifications.
- By default, it is assumed that Kafka message values contain the schema ID as a prefix, because this is the default behaviour for the SchemaRegistry Kafka producer format. To consume messages with the schema written in the header, the following property must be set for the Registry client: store.schema.version.id.in.header: true.
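The header setting can be sketched as a plain configuration map. Only the property name and value come from the text above; the class name is hypothetical, and how the map is handed to the Registry client depends on how the catalog is configured, which is not covered here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: collects Registry client settings for header-based schema IDs. */
final class RegistryClientConfig {
    // Property named in the limitation above.
    static final String SCHEMA_ID_IN_HEADER = "store.schema.version.id.in.header";

    /** Returns a configuration map that enables reading the schema version ID from the header. */
    static Map<String, String> headerSchemaIdConfig() {
        Map<String, String> config = new HashMap<>();
        config.put(SCHEMA_ID_IN_HEADER, "true");
        return config;
    }
}
```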
