Python API documentation¶

common¶

flink_rest_client.client module¶

class flink_rest_client.client.FlinkRestClient¶

Bases: object

static get(host, port=None, version=None)¶

Constructs a new rest client instance.

Parameters

host (str) – Hostname of Flink Jobmanager
port (int) – Port number. Default value: 8081
version (str) – Version of the REST API. Default value: v1
version –

flink_rest_client.common module¶

exception flink_rest_client.common.RestException(*args: object)¶

Bases: Exception

Exception to catch REST API related exceptions.

version 1¶

flink_rest_client.v1.client module¶

class flink_rest_client.v1.client.DatasetTrigger(prefix, trigger_id)¶

Bases: object

property status¶

class flink_rest_client.v1.client.FlinkRestClientV1(host, port)¶

Bases: object

property api_url¶

config()¶

Returns the configuration of the WebUI.

Endpoint: [GET] /config

Returns: Query result as a dict.
Return type: dict

datasets()¶

Returns all cluster data sets.

Endpoint: [GET] /datasets

Returns: Query result as a list of datasets.
Return type: list

delete_cluster()¶

Shuts down the cluster.

Endpoint: [DELETE] /cluster

Returns: Result of delete operation.
Return type: dict

delete_dataset(dataset_id)¶

Triggers the deletion of a cluster data set. This async operation would return a DatasetTrigger for further query identifier.

Endpoint: [DELETE] /datasets/:datasetid

Parameters: dataset_id (str) – 32-character hexadecimal string value that identifies a cluster data set.
Returns: Object that can be used to query the status of delete operation.
Return type: DatasetTrigger

property jars¶

property jobmanager¶

property jobs¶

overview()¶

Returns an overview over the Flink cluster.

Endpoint: [GET] /overview

Returns: Key-value pairs of flink cluster infos.
Return type: dict

property taskmanagers¶

flink_rest_client.v1.jars module¶

class flink_rest_client.v1.jars.JarsClient(prefix)¶

Bases: object

all()¶

Returns a list of all jars previously uploaded via ‘/jars/upload’.

Endpoint: [GET] /jars

Returns: List all the jars were previously uploaded.
Return type: dict

delete(jar_id)¶

Deletes a jar previously uploaded via ‘/jars/upload’.

Endpoint: [DELETE] /jars/:jarid

Parameters: jar_id (str) – String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars.
Returns: True, if jar_id has been successfully deleted, otherwise False.
Return type: bool
Raises: RestException – If the jar_id does not exist.

get_plan(jar_id)¶

Returns the dataflow plan of a job contained in a jar previously uploaded via ‘/jars/upload’.

Endpoint: [POST] /jars/:jarid/plan

Parameters: jar_id (str) – String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars.xe
Returns: Details of the jar_id’s plan.
Return type: dict
Raises: RestException – If the jar_id does not exist.

run(jar_id, arguments=None, entry_class=None, parallelism=None, savepoint_path=None, allow_non_restored_state=None)¶

Submits a job by running a jar previously uploaded via ‘/jars/upload’.

Endpoint: [POST] /jars/:jarid/run

Parameters

jar_id (str) – String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars.
arguments (dict) – (Optional) Dict of program arguments.
entry_class (str) – (Optional) String value that specifies the fully qualified name of the entry point class. Overrides the class defined in the jar file manifest.
parallelism (int) – (Optional) Positive integer value that specifies the desired parallelism for the job.
savepoint_path (str) – (Optional) String value that specifies the path of the savepoint to restore the job from.
allow_non_restored_state (bool) – (Optional) Boolean value that specifies whether the job submission should be rejected if the savepoint contains state that cannot be mapped back to the job.

Returns

32-character hexadecimal string value that identifies a job.

Return type

str

Raises

RestException – If the jar_id does not exist.

upload(path_to_jar)¶

Uploads a jar to the cluster from the input path. The jar’s name will be the original filename from the input path.

Endpoint: [POST] /jars/upload

Parameters: path_to_jar (str) – Path to the jar file.
Returns: Result of jar upload.
Return type: dict

upload_and_run(path_to_jar, arguments=None, entry_class=None, parallelism=None, savepoint_path=None, allow_non_restored_state=None)¶

Helper method to upload and start a jar in one method call.

Parameters

path_to_jar (str) – Path to the jar file.
arguments (dict) – (Optional) Comma-separated list of program arguments.
entry_class (str) – (Optional) String value that specifies the fully qualified name of the entry point class. Overrides the class defined in the jar file manifest.
parallelism (int) – (Optional) Positive integer value that specifies the desired parallelism for the job.
savepoint_path (str) – (Optional) String value that specifies the path of the savepoint to restore the job from.
allow_non_restored_state (bool) – (Optional) Boolean value that specifies whether the job submission should be rejected if the savepoint contains state that cannot be mapped back to the job.

Returns

32-character hexadecimal string value that identifies a job.

Return type

str

Raises

RestException – If an error occurred during the upload of jar file.

flink_rest_client.v1.jobmanager module¶

class flink_rest_client.v1.jobmanager.JobmanagerClient(prefix)¶

Bases: object

config()¶

Returns the cluster configuration.

Endpoint: [GET] /jobmanager/config

Returns: Cluster configuration dictionary.
Return type: dict

get_log(log_file)¶

Returns the content of the log_file.

Endpoint: [GET] /jobmanager/logs/:log_file

Parameters: log_file (str) – Name of the log file.
Returns: The content of the log file as a string
Return type: str

logs()¶

Returns the list of log files on the JobManager.

Endpoint: [GET] /jobmanager/logs

Returns: List of log files
Return type: dict

metric_names()¶

Return the supported metric names.

Returns: List of metric names.
Return type: list

metrics()¶

Provides access to job manager metrics.

Endpoint: [GET] /jobmanager/metrics

dict
Jobmanager metrics

flink_rest_client.v1.jobs module¶

class flink_rest_client.v1.jobs.JobTrigger(prefix, type_name, job_id, trigger_id)¶

Bases: object

property status¶

class flink_rest_client.v1.jobs.JobVertexClient(prefix, job_id, vertex_id)¶

Bases: object

backpressure()¶

Returns back-pressure information for a job, and may initiate back-pressure sampling if necessary.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/backpressure

Notes

The deprecated status means that the back pressure stats are not available.

Returns: Backpressure information
Return type: dict

details()¶

Returns details for a task, with a summary for each of its subtasks.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid

Returns: details for a task.
Return type: dict

metric_names()¶

Returns the supported metric names.

Returns: List of metric names.
Return type: list

metrics(metric_names=None)¶

Provides access to task metrics.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/metrics

Returns: Task metrics.
Return type: dict

property prefix_url¶

property subtasks¶

subtasktimes()¶

Returns time-related information for all subtasks of a task.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasktimes

Returns: Time-related information for all subtasks
Return type: dict

taskmanagers()¶

Returns task information aggregated by task manager.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/taskmanagers

Returns: Task information aggregated by task manager.
Return type: dict

watermarks()¶

Returns the watermarks for all subtasks of a task.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/watermarks

Returns: Watermarks for all subtasks of a task.
Return type: list

class flink_rest_client.v1.jobs.JobVertexSubtaskClient(prefix)¶

Bases: object

accumulators()¶

Returns all user-defined accumulators for all subtasks of a task.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/accumulators

Returns: User-defined accumulators
Return type: dict

get(subtask_id)¶

Returns details of the current or latest execution attempt of a subtask.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasks/:subtaskindex

Parameters: subtask_id (int) – Positive integer value that identifies a subtask.
Returns
Return type: dict

get_attempt(subtask_id, attempt_id=None)¶

Returns details of an execution attempt of a subtask. Multiple execution attempts happen in case of failure/recovery.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasks/:subtaskindex/attempts/:attempt

Parameters

subtask_id (int) – Positive integer value that identifies a subtask.
attempt_id (int) – (Optional) Positive integer value that identifies an execution attempt. Default: current execution attempt’s id

Returns

Details of the selected attempt.

Return type

dict

get_attempt_accumulators(subtask_id, attempt_id=None)¶

Returns the accumulators of an execution attempt of a subtask. Multiple execution attempts happen in case of failure/recovery.

Parameters

subtask_id (int) – Positive integer value that identifies a subtask.
attempt_id (int) – (Optional) Positive integer value that identifies an execution attempt. Default: current execution attempt’s id

Returns

The accumulators of the selected execution attempt of a subtask.

Return type

dict

metric_names()¶

Returns the supported metric names.

Returns: List of metric names.
Return type: list

metrics(metric_names=None, agg_modes=None, subtask_ids=None)¶

Provides access to aggregated subtask metrics. By default it returns with all existing metric names.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasks/metrics

Parameters

metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>
agg_modes (list) – (optional) List of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”. Default: <all modes>
subtask_ids (list) – List of positive integers to select specific subtasks. The list of valid subtask ids is available through the subtask_ids() method. Default: <all subtasks>.

Returns

Key-value pairs of metrics.

Return type

dict

property prefix_url¶

subtask_ids()¶

Returns the subtask identifiers.

Returns: Positive integer list of subtask ids.
Return type: list

class flink_rest_client.v1.jobs.JobsClient(prefix)¶

Bases: object

all()¶

Returns an overview over all jobs and their current state.

Endpoint: [GET] /jobs

Returns: List of jobs and their current state.
Return type: list

create_savepoint(job_id, target_directory, cancel_job=False)¶

Triggers a savepoint, and optionally cancels the job afterwards. This async operation would return a JobTrigger for further query identifier.

Endpoint: [GET] /jobs/:jobid/savepoints

Notes

The target directory has to be a location accessible by both the JobManager(s) and TaskManager(s) e.g. a location on a distributed file-system or Object Store.

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.
target_directory (str) – Savepoint target directory.
cancel_job (bool) – If it is True, it also stops the job after the savepoint creation.

Returns

Object that can be used to query the status of savepoint.

Return type

JobTrigger

get(job_id)¶

Returns details of a job.

Endpoint: [GET] /jobs/:jobid

Parameters: job_id (str) – 32-character hexadecimal string value that identifies a job.
Returns: Details of the selected job.
Return type: dict

get_accumulators(job_id, include_serialized_value=None)¶

Returns the accumulators for all tasks of a job, aggregated across the respective subtasks.

Endpoint: [GET] /jobs/:jobid/accumulators

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.
include_serialized_value (bool) – (Optional) Boolean value that specifies whether serialized user task accumulators should be included in the response.

Returns

Accumulators for all task.

Return type

dict

get_checkpoint_details(job_id, checkpoint_id, show_subtasks=False)¶

Returns details for a checkpoint.

Endpoint: [GET] /jobs/:jobid/checkpoints/details/:checkpointid

If show_subtasks is true: Endpoint: [GET] /jobs/:jobid/checkpoints/details/:checkpointid/subtasks/:vertexid

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.
checkpoint_id (int) – Long value that identifies a checkpoint.
show_subtasks (bool) – If it is True, the details of the subtask are also returned.

Returns

Return type

dict

get_checkpoint_ids(job_id)¶

Returns checkpoint ids of the job_id.

Parameters: job_id (str) – 32-character hexadecimal string value that identifies a job.
Returns: List of checkpoint ids.
Return type: list

get_checkpointing_configuration(job_id)¶

Returns the checkpointing configuration of the selected job_id

Endpoint: [GET] /jobs/:jobid/checkpoints/config

Parameters: job_id (str) – 32-character hexadecimal string value that identifies a job.
Returns: Checkpointing configuration of the selected job.
Return type: dict

get_checkpoints(job_id)¶

Returns checkpointing statistics for a job.

Endpoint: [GET] /jobs/:jobid/checkpoints

Parameters: job_id (str) – 32-character hexadecimal string value that identifies a job.
Returns: Checkpointing statistics for the selected job: counts, summary, latest and history.
Return type: dict

get_config(job_id)¶

Returns the configuration of a job.

Endpoint: [GET] /jobs/:jobid/config

Parameters: job_id (str) – 32-character hexadecimal string value that identifies a job.
Returns: Job configuration
Return type: dict

get_exceptions(job_id)¶

Returns the most recent exceptions that have been handled by Flink for this job.

Endpoint: [GET] /jobs/:jobid/exceptions

Parameters: job_id (str) – 32-character hexadecimal string value that identifies a job.
Returns: The most recent exceptions.
Return type: dict

get_execution_result(job_id)¶

Returns the result of a job execution. Gives access to the execution time of the job and to all accumulators created by this job.

Endpoint: [GET] /jobs/:jobid/execution-result

Parameters: job_id (str) – 32-character hexadecimal string value that identifies a job.
Returns: The execution result of the selected job.
Return type: dict

get_metrics(job_id, metric_names=None)¶

Provides access to job metrics.

Endpoint: [GET] /jobs/:jobid/metrics

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.
metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>

Returns

Job metrics.

Return type

dict

get_plan(job_id)¶

Returns the dataflow plan of a job.

Endpoint: [GET] /jobs/:jobid/plan

Parameters: job_id (str) – 32-character hexadecimal string value that identifies a job.
Returns: Dataflow plan
Return type: dict

get_vertex(job_id, vertex_id)¶

Returns a JobVertexClient.

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.
vertex_id (str) – 32-character hexadecimal string value that identifies a vertex.

Returns

JobVertexClient instance that can execute vertex related queries.

Return type

JobVertexClient

get_vertex_ids(job_id)¶

Returns the ids of vertices of the selected job.

Parameters: job_id (str) – 32-character hexadecimal string value that identifies a job.
Returns: List of identifiers.
Return type: list

job_ids()¶

Returns the list of job_ids.

Returns: List of job ids.
Return type: list

metric_names()¶

Returns the supported metric names.

Returns: List of metric names.
Return type: list

metrics(metric_names=None, agg_modes=None, job_ids=None)¶

Returns an overview over all jobs.

Endpoint: [GET] /jobs/metrics

Parameters

metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>
agg_modes (list) – (optional) List of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”. Default: <all modes>
job_ids (list) – List of 32-character hexadecimal strings to select specific jobs. The list of valid jobs are available through the job_ids() method. Default: <all taskmanagers>.

Returns

Aggregated job metrics.

Return type

dict

overview()¶

Returns an overview over all jobs.

Endpoint: [GET] /jobs/overview

Returns: List of existing jobs.
Return type: list

rescale(job_id, parallelism)¶

Triggers the rescaling of a job. This async operation would return a ‘triggerid’ for further query identifier.

Endpoint: [GET] /jobs/:jobid/rescaling

Notes

Using Flink version 1.12, the method will raise RestHandlerException because this rescaling is temporarily disabled. See FLINK-12312.

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.
parallelism (int) – Positive integer value that specifies the desired parallelism.

Returns

Object that can be used to query the status of rescaling.

Return type

JobTrigger

stop(job_id, target_directory, drain=False)¶

Stops a job with a savepoint. This async operation would return a JobTrigger for further query identifier.

Attention: The target directory has to be a location accessible by both the JobManager(s) and TaskManager(s) e.g. a location on a distributed file-system or Object Store.

Draining emits the maximum watermark before stopping the job. When the watermark is emitted, all event time timers will fire, allowing you to process events that depend on this timer (e.g. time windows or process functions). This is useful when you want to fully shut down your job without leaving any unhandled events or state.

Endpoint: [GET] /jobs/:jobid/stop

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.
target_directory (str) – Savepoint target directory.
drain (bool) – (Optional) If it is True, it emits the maximum watermark before stopping the job. default: False

Returns

Object that can be used to query the status of savepoint.

Return type

JobTrigger

terminate(job_id)¶

Terminates a job.

Endpoint: [PATCH] /jobs/:jobid

Parameters: job_id (str) – 32-character hexadecimal string value that identifies a job.
Returns: True if the job has been canceled, otherwise False.
Return type: bool

flink_rest_client.v1.taskmanagers module¶

class flink_rest_client.v1.taskmanagers.TaskManagersClient(prefix)¶

Bases: object

all()¶

Returns an overview over all task managers.

Endpoint: [GET] /taskmanagers

Returns: List of taskmanagers. Each taskmanager is represented by a dictionary.
Return type: list

get(taskmanager_id)¶

Returns details for a task manager.

Endpoint: [GET] /taskmanagers/:taskmanagerid

Parameters: taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.
Returns: Query result as a dict.
Return type: dict

get_logs(taskmanager_id)¶

Returns the list of log files on a TaskManager.

Endpoint: [GET] /taskmanagers/:taskmanagerid/logs

Parameters: taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.
Returns: List of log files in which each element contains a name and size fields.
Return type: list

get_metrics(taskmanager_id, metric_names=None)¶

Provides access to task manager metrics.

Endpoint: [GET] /taskmanagers/:taskmanagerid/metrics

Parameters

taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.
metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>

Returns

Metric name -> Metric value key-value pairs. The values are provided as strings.

Return type

dict

get_thread_dump(taskmanager_id)¶

Returns the thread dump of the requested TaskManager.

Endpoint: [GET] /taskmanagers/:taskmanagerid/thread-dump

Parameters: taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.
Returns: ThreadName -> StringifiedThreadInfo key-value pairs.
Return type: dict

metric_names()¶

Return the supported metric names.

Returns: List of metric names.
Return type: list

metrics(metric_names=None, agg_modes=None, taskmanager_ids=None)¶

Provides access to aggregated task manager metrics. By default it returns with all existing metric names.

Endpoint: [GET] /taskmanagers/metrics

Parameters

metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>
agg_modes (list) – (optional) List of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”. Default: <all modes>
taskmanager_ids (list) – List of 32-character hexadecimal strings to select specific task managers. The list of valid taskmanager ids are available through the taskmanager_ids() method. Default: <all taskmanagers>.

Returns

Key-value pairs of metrics.

Return type

dict

taskmanager_ids()¶

Returns the list of taskmanager_ids.

Returns: List of taskmanager ids.
Return type: list

Python API documentation¶

common¶

flink_rest_client.client module¶

flink_rest_client.common module¶

version 1¶

flink_rest_client.v1.client module¶

flink_rest_client.v1.jars module¶

flink_rest_client.v1.jobmanager module¶

flink_rest_client.v1.jobs module¶

flink_rest_client.v1.taskmanagers module¶

Module contents¶