IBM Z Software


Status Planned for future release
Created by Guest
Created on May 30, 2025

Incorporate Support for Spark Connect Dataframe API into IBM Apache Spark Version 4

Customers running Apache Spark on Z at V3.5.1 cannot securely initiate a Z-based Spark workload from off-platform (including z/Linux, OCP, or RHEL x86). The solution, available since Apache Spark 3.4, is the Spark Connect DataFrame API. We ask IBM to incorporate this support into its Apache Spark V4 distribution.

Off-platform consumers may access mainframe data assets using IBM/DVM (aka Rocket Data Virtualization) with JDBC/ODBC from hybrid cloud or RHEL s390x / x86. Consumers can also access Db2 for z/OS from off-platform using the Type 4 JDBC driver. These drivers are useful for modest payloads, but only Apache Spark can deliver the I/O and compute parallelism required for massive batch workloads.

Without Spark Connect, the off-platform consumer must devise some method to kick off a z/OS BPXBATCH Spark invocation or a spark-submit session within z/OS UNIX 3.1. There is no reliable method to accomplish this objective. With Spark Connect (the architecture is documented at https://spark.apache.org/spark-connect/), consumers can instantiate a 'remote' SparkSession.

The URL above lays out the multiple usability and support benefits of Spark Connect.
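
As a rough illustration of what a 'remote' SparkSession looks like from an off-platform client, here is a minimal PySpark sketch. It assumes a Spark Connect server is already listening on the z/OS side, which is exactly what this idea requests and is not something the current distribution is known to provide. The helper names `build_connect_url` and `open_remote_session` and the host name are hypothetical; `SparkSession.builder.remote(...)` and the `sc://host:port/;key=value` connection string come from upstream Apache Spark 3.4+. The token is assumed to contain no `;` or `=` characters.

```python
from typing import Optional


def build_connect_url(host: str, port: int = 15002,
                      token: Optional[str] = None,
                      use_ssl: bool = False) -> str:
    """Build a Spark Connect connection string: sc://host:port/;key=value;..."""
    params = []
    if use_ssl:
        params.append("use_ssl=true")
    if token is not None:
        # Assumption: the token needs no escaping (no ';' or '=' in it).
        params.append(f"token={token}")
    url = f"sc://{host}:{port}/"
    if params:
        url += ";" + ";".join(params)
    return url


def open_remote_session(url: str):
    """Create a remote SparkSession; the client needs pyspark 3.4+ with Connect support."""
    # Imported lazily so the URL helper above works without pyspark installed.
    from pyspark.sql import SparkSession
    return SparkSession.builder.remote(url).getOrCreate()


if __name__ == "__main__":
    # Hypothetical host name; 15002 is the upstream default Spark Connect port.
    print(build_connect_url("zos-host.example.com", use_ssl=True, token="my-token"))
```

With a reachable server, `open_remote_session(url)` would return a session whose DataFrame calls run on the remote driver, with no BPXBATCH or spark-submit step on the client side.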

Idea priority High
  • Guest
    Aug 13, 2025

    Spark Connect uses the DataFrame APIs; they are implicit in Spark Connect:

    • With classic Spark: the client application and the Spark driver run in the same JVM, and DataFrame API calls are executed directly in that JVM.

    • With Spark Connect: the client application uses the same DataFrame API, but those API calls are serialized as Protobuf messages and sent to a remote Spark driver over gRPC.
      The Spark driver executes the query plan and returns the results.
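
The difference between the two modes can be sketched with a toy model. This is not Spark's implementation: JSON stands in for Protobuf, a plain function call stands in for gRPC, and every name here (`ToyDataFrame`, `toy_driver_execute`) is hypothetical. It only shows the shape of the idea: the client records DataFrame operations as a plan, ships the serialized plan, and the driver executes it and returns rows.

```python
import json
from typing import Any, Dict, List

Row = Dict[str, Any]


class ToyDataFrame:
    """Client-side handle that only records operations; nothing runs locally."""

    def __init__(self, plan: List[dict]):
        self._plan = plan

    @classmethod
    def table(cls, name: str) -> "ToyDataFrame":
        return cls([{"op": "read", "table": name}])

    def filter_eq(self, column: str, value: Any) -> "ToyDataFrame":
        step = {"op": "filter_eq", "column": column, "value": value}
        return ToyDataFrame(self._plan + [step])

    def select(self, *columns: str) -> "ToyDataFrame":
        return ToyDataFrame(self._plan + [{"op": "select", "columns": list(columns)}])

    def serialize(self) -> bytes:
        # JSON is a stand-in for the Protobuf encoding used by Spark Connect.
        return json.dumps(self._plan).encode("utf-8")


def toy_driver_execute(message: bytes, tables: Dict[str, List[Row]]) -> List[Row]:
    """Stand-in for the remote driver: decode the plan, run it, return rows."""
    rows: List[Row] = []
    for step in json.loads(message.decode("utf-8")):
        if step["op"] == "read":
            rows = list(tables[step["table"]])
        elif step["op"] == "filter_eq":
            rows = [r for r in rows if r[step["column"]] == step["value"]]
        elif step["op"] == "select":
            rows = [{c: r[c] for c in step["columns"]} for r in rows]
        else:
            raise ValueError(f"unknown operation: {step['op']}")
    return rows


if __name__ == "__main__":
    # Made-up sample data held on the "driver" side only.
    tables = {"accounts": [{"id": 1, "region": "EU", "balance": 10},
                           {"id": 2, "region": "US", "balance": 20}]}
    df = ToyDataFrame.table("accounts").filter_eq("region", "EU").select("id", "balance")
    print(toy_driver_execute(df.serialize(), tables))
```

The client never touches the data; it holds only the plan, which is why the client and the driver no longer need to share a JVM or a platform.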
  • Guest
    Aug 13, 2025

    Without Spark Connect, the options for accessing a Spark cluster are dated and do not scale:

    • spark-submit.sh or the Spark shell were the only ways in. This is not a sustainable software design, because the client app has to coexist with the Spark cluster and must ship its entire code (as a JAR) to the cluster to get the job done. The result is a tightly coupled solution architecture.
    • Without Spark Connect, clients have to write their app logic in Java or Scala. Spark Connect makes this language-agnostic, so they can also write their logic in Go, Python, Swift, and so on.
    • Spark Connect enables clients to create multiplexed streaming applications (thanks to gRPC). This was not easy without Spark Connect.