flink-rest-client

Table of contents

  • Installation
  • API endpoint mapping
  • Usage examples
  • Python API documentation
flink-rest-client
  • »
  • flink_rest_client.v1 package
  • View page source

flink_rest_client.v1 package¶

Submodules¶

flink_rest_client.v1.client module¶

class flink_rest_client.v1.client.DatasetTrigger(prefix, trigger_id)¶

Bases: object

property status¶
class flink_rest_client.v1.client.FlinkRestClientV1(host, port)¶

Bases: object

property api_url¶
config()¶

Returns the configuration of the WebUI.

Endpoint: [GET] /config

Returns

Query result as a dict.

Return type

dict

datasets()¶

Returns all cluster data sets.

Endpoint: [GET] /datasets

Returns

Query result as a list of datasets.

Return type

list

delete_cluster()¶

Shuts down the cluster.

Endpoint: [DELETE] /cluster

Returns

Result of delete operation.

Return type

dict

delete_dataset(dataset_id)¶

Triggers the deletion of a cluster data set. This async operation would return a DatasetTrigger for further query identifier.

Endpoint: [DELETE] /datasets/:datasetid

Parameters

dataset_id (str) – 32-character hexadecimal string value that identifies a cluster data set.

Returns

Object that can be used to query the status of delete operation.

Return type

DatasetTrigger

property jars¶
property jobmanager¶
property jobs¶
overview()¶

Returns an overview over the Flink cluster.

Endpoint: [GET] /overview

Returns

Key-value pairs of flink cluster infos.

Return type

dict

property taskmanagers¶

flink_rest_client.v1.jars module¶

class flink_rest_client.v1.jars.JarsClient(prefix)¶

Bases: object

all()¶

Returns a list of all jars previously uploaded via ‘/jars/upload’.

Endpoint: [GET] /jars

Returns

List all the jars were previously uploaded.

Return type

dict

delete(jar_id)¶

Deletes a jar previously uploaded via ‘/jars/upload’.

Endpoint: [DELETE] /jars/:jarid

Parameters

jar_id (str) – String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars.

Returns

True, if jar_id has been successfully deleted, otherwise False.

Return type

bool

Raises

RestException – If the jar_id does not exist.

get_plan(jar_id)¶

Returns the dataflow plan of a job contained in a jar previously uploaded via ‘/jars/upload’.

Endpoint: [POST] /jars/:jarid/plan

Parameters

jar_id (str) – String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars.xe

Returns

Details of the jar_id’s plan.

Return type

dict

Raises

RestException – If the jar_id does not exist.

run(jar_id, arguments=None, entry_class=None, parallelism=None, savepoint_path=None, allow_non_restored_state=None)¶

Submits a job by running a jar previously uploaded via ‘/jars/upload’.

Endpoint: [POST] /jars/:jarid/run

Parameters
  • jar_id (str) – String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars.

  • arguments (dict) – (Optional) Dict of program arguments.

  • entry_class (str) – (Optional) String value that specifies the fully qualified name of the entry point class. Overrides the class defined in the jar file manifest.

  • parallelism (int) – (Optional) Positive integer value that specifies the desired parallelism for the job.

  • savepoint_path (str) – (Optional) String value that specifies the path of the savepoint to restore the job from.

  • allow_non_restored_state (bool) – (Optional) Boolean value that specifies whether the job submission should be rejected if the savepoint contains state that cannot be mapped back to the job.

Returns

32-character hexadecimal string value that identifies a job.

Return type

str

Raises

RestException – If the jar_id does not exist.

upload(path_to_jar)¶

Uploads a jar to the cluster from the input path. The jar’s name will be the original filename from the input path.

Endpoint: [POST] /jars/upload

Parameters

path_to_jar (str) – Path to the jar file.

Returns

Result of jar upload.

Return type

dict

upload_and_run(path_to_jar, arguments=None, entry_class=None, parallelism=None, savepoint_path=None, allow_non_restored_state=None)¶

Helper method to upload and start a jar in one method call.

Parameters
  • path_to_jar (str) – Path to the jar file.

  • arguments (dict) – (Optional) Comma-separated list of program arguments.

  • entry_class (str) – (Optional) String value that specifies the fully qualified name of the entry point class. Overrides the class defined in the jar file manifest.

  • parallelism (int) – (Optional) Positive integer value that specifies the desired parallelism for the job.

  • savepoint_path (str) – (Optional) String value that specifies the path of the savepoint to restore the job from.

  • allow_non_restored_state (bool) – (Optional) Boolean value that specifies whether the job submission should be rejected if the savepoint contains state that cannot be mapped back to the job.

Returns

32-character hexadecimal string value that identifies a job.

Return type

str

Raises

RestException – If an error occurred during the upload of jar file.

flink_rest_client.v1.jobmanager module¶

class flink_rest_client.v1.jobmanager.JobmanagerClient(prefix)¶

Bases: object

config()¶

Returns the cluster configuration.

Endpoint: [GET] /jobmanager/config

Returns

Cluster configuration dictionary.

Return type

dict

get_log(log_file)¶

Returns the content of the log_file.

Endpoint: [GET] /jobmanager/logs/:log_file

Parameters

log_file (str) – Name of the log file.

Returns

The content of the log file as a string

Return type

str

logs()¶

Returns the list of log files on the JobManager.

Endpoint: [GET] /jobmanager/logs

Returns

List of log files

Return type

dict

metric_names()¶

Return the supported metric names.

Returns

List of metric names.

Return type

list

metrics()¶

Provides access to job manager metrics.

Endpoint: [GET] /jobmanager/metrics

dict

Jobmanager metrics

flink_rest_client.v1.jobs module¶

class flink_rest_client.v1.jobs.JobTrigger(prefix, type_name, job_id, trigger_id)¶

Bases: object

property status¶
class flink_rest_client.v1.jobs.JobVertexClient(prefix, job_id, vertex_id)¶

Bases: object

backpressure()¶

Returns back-pressure information for a job, and may initiate back-pressure sampling if necessary.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/backpressure

Notes

The deprecated status means that the back pressure stats are not available.

Returns

Backpressure information

Return type

dict

details()¶

Returns details for a task, with a summary for each of its subtasks.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid

Returns

details for a task.

Return type

dict

metric_names()¶

Returns the supported metric names.

Returns

List of metric names.

Return type

list

metrics(metric_names=None)¶

Provides access to task metrics.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/metrics

Returns

Task metrics.

Return type

dict

property prefix_url¶
property subtasks¶
subtasktimes()¶

Returns time-related information for all subtasks of a task.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasktimes

Returns

Time-related information for all subtasks

Return type

dict

taskmanagers()¶

Returns task information aggregated by task manager.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/taskmanagers

Returns

Task information aggregated by task manager.

Return type

dict

watermarks()¶

Returns the watermarks for all subtasks of a task.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/watermarks

Returns

Watermarks for all subtasks of a task.

Return type

list

class flink_rest_client.v1.jobs.JobVertexSubtaskClient(prefix)¶

Bases: object

accumulators()¶

Returns all user-defined accumulators for all subtasks of a task.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/accumulators

Returns

User-defined accumulators

Return type

dict

get(subtask_id)¶

Returns details of the current or latest execution attempt of a subtask.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasks/:subtaskindex

Parameters

subtask_id (int) – Positive integer value that identifies a subtask.

Returns

Return type

dict

get_attempt(subtask_id, attempt_id=None)¶

Returns details of an execution attempt of a subtask. Multiple execution attempts happen in case of failure/recovery.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasks/:subtaskindex/attempts/:attempt

Parameters
  • subtask_id (int) – Positive integer value that identifies a subtask.

  • attempt_id (int) – (Optional) Positive integer value that identifies an execution attempt. Default: current execution attempt’s id

Returns

Details of the selected attempt.

Return type

dict

get_attempt_accumulators(subtask_id, attempt_id=None)¶

Returns the accumulators of an execution attempt of a subtask. Multiple execution attempts happen in case of failure/recovery.

Parameters
  • subtask_id (int) – Positive integer value that identifies a subtask.

  • attempt_id (int) – (Optional) Positive integer value that identifies an execution attempt. Default: current execution attempt’s id

Returns

The accumulators of the selected execution attempt of a subtask.

Return type

dict

metric_names()¶

Returns the supported metric names.

Returns

List of metric names.

Return type

list

metrics(metric_names=None, agg_modes=None, subtask_ids=None)¶

Provides access to aggregated subtask metrics. By default it returns with all existing metric names.

Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasks/metrics

Parameters
  • metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>

  • agg_modes (list) – (optional) List of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”. Default: <all modes>

  • subtask_ids (list) – List of positive integers to select specific subtasks. The list of valid subtask ids is available through the subtask_ids() method. Default: <all subtasks>.

Returns

Key-value pairs of metrics.

Return type

dict

property prefix_url¶
subtask_ids()¶

Returns the subtask identifiers.

Returns

Positive integer list of subtask ids.

Return type

list

class flink_rest_client.v1.jobs.JobsClient(prefix)¶

Bases: object

all()¶

Returns an overview over all jobs and their current state.

Endpoint: [GET] /jobs

Returns

List of jobs and their current state.

Return type

list

create_savepoint(job_id, target_directory, cancel_job=False)¶

Triggers a savepoint, and optionally cancels the job afterwards. This async operation would return a JobTrigger for further query identifier.

Endpoint: [GET] /jobs/:jobid/savepoints

Notes

The target directory has to be a location accessible by both the JobManager(s) and TaskManager(s) e.g. a location on a distributed file-system or Object Store.

Parameters
  • job_id (str) – 32-character hexadecimal string value that identifies a job.

  • target_directory (str) – Savepoint target directory.

  • cancel_job (bool) – If it is True, it also stops the job after the savepoint creation.

Returns

Object that can be used to query the status of savepoint.

Return type

JobTrigger

get(job_id)¶

Returns details of a job.

Endpoint: [GET] /jobs/:jobid

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.

Returns

Details of the selected job.

Return type

dict

get_accumulators(job_id, include_serialized_value=None)¶

Returns the accumulators for all tasks of a job, aggregated across the respective subtasks.

Endpoint: [GET] /jobs/:jobid/accumulators

Parameters
  • job_id (str) – 32-character hexadecimal string value that identifies a job.

  • include_serialized_value (bool) – (Optional) Boolean value that specifies whether serialized user task accumulators should be included in the response.

Returns

Accumulators for all task.

Return type

dict

get_checkpoint_details(job_id, checkpoint_id, show_subtasks=False)¶

Returns details for a checkpoint.

Endpoint: [GET] /jobs/:jobid/checkpoints/details/:checkpointid

If show_subtasks is true: Endpoint: [GET] /jobs/:jobid/checkpoints/details/:checkpointid/subtasks/:vertexid

Parameters
  • job_id (str) – 32-character hexadecimal string value that identifies a job.

  • checkpoint_id (int) – Long value that identifies a checkpoint.

  • show_subtasks (bool) – If it is True, the details of the subtask are also returned.

Returns

Return type

dict

get_checkpoint_ids(job_id)¶

Returns checkpoint ids of the job_id.

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.

Returns

List of checkpoint ids.

Return type

list

get_checkpointing_configuration(job_id)¶

Returns the checkpointing configuration of the selected job_id

Endpoint: [GET] /jobs/:jobid/checkpoints/config

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.

Returns

Checkpointing configuration of the selected job.

Return type

dict

get_checkpoints(job_id)¶

Returns checkpointing statistics for a job.

Endpoint: [GET] /jobs/:jobid/checkpoints

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.

Returns

Checkpointing statistics for the selected job: counts, summary, latest and history.

Return type

dict

get_config(job_id)¶

Returns the configuration of a job.

Endpoint: [GET] /jobs/:jobid/config

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.

Returns

Job configuration

Return type

dict

get_exceptions(job_id)¶

Returns the most recent exceptions that have been handled by Flink for this job.

Endpoint: [GET] /jobs/:jobid/exceptions

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.

Returns

The most recent exceptions.

Return type

dict

get_execution_result(job_id)¶

Returns the result of a job execution. Gives access to the execution time of the job and to all accumulators created by this job.

Endpoint: [GET] /jobs/:jobid/execution-result

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.

Returns

The execution result of the selected job.

Return type

dict

get_metrics(job_id, metric_names=None)¶

Provides access to job metrics.

Endpoint: [GET] /jobs/:jobid/metrics

Parameters
  • job_id (str) – 32-character hexadecimal string value that identifies a job.

  • metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>

Returns

Job metrics.

Return type

dict

get_plan(job_id)¶

Returns the dataflow plan of a job.

Endpoint: [GET] /jobs/:jobid/plan

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.

Returns

Dataflow plan

Return type

dict

get_vertex(job_id, vertex_id)¶

Returns a JobVertexClient.

Parameters
  • job_id (str) – 32-character hexadecimal string value that identifies a job.

  • vertex_id (str) – 32-character hexadecimal string value that identifies a vertex.

Returns

JobVertexClient instance that can execute vertex related queries.

Return type

JobVertexClient

get_vertex_ids(job_id)¶

Returns the ids of vertices of the selected job.

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.

Returns

List of identifiers.

Return type

list

job_ids()¶

Returns the list of job_ids.

Returns

List of job ids.

Return type

list

metric_names()¶

Returns the supported metric names.

Returns

List of metric names.

Return type

list

metrics(metric_names=None, agg_modes=None, job_ids=None)¶

Returns an overview over all jobs.

Endpoint: [GET] /jobs/metrics

Parameters
  • metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>

  • agg_modes (list) – (optional) List of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”. Default: <all modes>

  • job_ids (list) – List of 32-character hexadecimal strings to select specific jobs. The list of valid jobs are available through the job_ids() method. Default: <all taskmanagers>.

Returns

Aggregated job metrics.

Return type

dict

overview()¶

Returns an overview over all jobs.

Endpoint: [GET] /jobs/overview

Returns

List of existing jobs.

Return type

list

rescale(job_id, parallelism)¶

Triggers the rescaling of a job. This async operation would return a ‘triggerid’ for further query identifier.

Endpoint: [GET] /jobs/:jobid/rescaling

Notes

Using Flink version 1.12, the method will raise RestHandlerException because this rescaling is temporarily disabled. See FLINK-12312.

Parameters
  • job_id (str) – 32-character hexadecimal string value that identifies a job.

  • parallelism (int) – Positive integer value that specifies the desired parallelism.

Returns

Object that can be used to query the status of rescaling.

Return type

JobTrigger

stop(job_id, target_directory, drain=False)¶

Stops a job with a savepoint. This async operation would return a JobTrigger for further query identifier.

Attention: The target directory has to be a location accessible by both the JobManager(s) and TaskManager(s) e.g. a location on a distributed file-system or Object Store.

Draining emits the maximum watermark before stopping the job. When the watermark is emitted, all event time timers will fire, allowing you to process events that depend on this timer (e.g. time windows or process functions). This is useful when you want to fully shut down your job without leaving any unhandled events or state.

Endpoint: [GET] /jobs/:jobid/stop

Parameters
  • job_id (str) – 32-character hexadecimal string value that identifies a job.

  • target_directory (str) – Savepoint target directory.

  • drain (bool) – (Optional) If it is True, it emits the maximum watermark before stopping the job. default: False

Returns

Object that can be used to query the status of savepoint.

Return type

JobTrigger

terminate(job_id)¶

Terminates a job.

Endpoint: [PATCH] /jobs/:jobid

Parameters

job_id (str) – 32-character hexadecimal string value that identifies a job.

Returns

True if the job has been canceled, otherwise False.

Return type

bool

flink_rest_client.v1.taskmanagers module¶

class flink_rest_client.v1.taskmanagers.TaskManagersClient(prefix)¶

Bases: object

all()¶

Returns an overview over all task managers.

Endpoint: [GET] /taskmanagers

Returns

List of taskmanagers. Each taskmanager is represented by a dictionary.

Return type

list

get(taskmanager_id)¶

Returns details for a task manager.

Endpoint: [GET] /taskmanagers/:taskmanagerid

Parameters

taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.

Returns

Query result as a dict.

Return type

dict

get_logs(taskmanager_id)¶

Returns the list of log files on a TaskManager.

Endpoint: [GET] /taskmanagers/:taskmanagerid/logs

Parameters

taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.

Returns

List of log files in which each element contains a name and size fields.

Return type

list

get_metrics(taskmanager_id, metric_names=None)¶

Provides access to task manager metrics.

Endpoint: [GET] /taskmanagers/:taskmanagerid/metrics

Parameters
  • taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.

  • metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>

Returns

Metric name -> Metric value key-value pairs. The values are provided as strings.

Return type

dict

get_thread_dump(taskmanager_id)¶

Returns the thread dump of the requested TaskManager.

Endpoint: [GET] /taskmanagers/:taskmanagerid/thread-dump

Parameters

taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.

Returns

ThreadName -> StringifiedThreadInfo key-value pairs.

Return type

dict

metric_names()¶

Return the supported metric names.

Returns

List of metric names.

Return type

list

metrics(metric_names=None, agg_modes=None, taskmanager_ids=None)¶

Provides access to aggregated task manager metrics. By default it returns with all existing metric names.

Endpoint: [GET] /taskmanagers/metrics

Parameters
  • metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>

  • agg_modes (list) – (optional) List of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”. Default: <all modes>

  • taskmanager_ids (list) – List of 32-character hexadecimal strings to select specific task managers. The list of valid taskmanager ids are available through the taskmanager_ids() method. Default: <all taskmanagers>.

Returns

Key-value pairs of metrics.

Return type

dict

taskmanager_ids()¶

Returns the list of taskmanager_ids.

Returns

List of taskmanager ids.

Return type

list

Module contents¶


© Copyright 2021, frego.dev.

Built with Sphinx using a theme provided by Read the Docs.