Python API documentation¶
common¶
flink_rest_client.client module¶
- class flink_rest_client.client.FlinkRestClient¶
Bases:
object
- static get(host, port=None, version=None)¶
Constructs a new rest client instance.
- Parameters
host (str) – Hostname of Flink Jobmanager
port (int) – Port number. Default value: 8081
version (str) – Version of the REST API. Default value: v1
version –
version 1¶
flink_rest_client.v1.client module¶
- class flink_rest_client.v1.client.DatasetTrigger(prefix, trigger_id)¶
Bases:
object
- property status¶
- class flink_rest_client.v1.client.FlinkRestClientV1(host, port)¶
Bases:
object
- property api_url¶
- config()¶
Returns the configuration of the WebUI.
Endpoint: [GET] /config
- Returns
Query result as a dict.
- Return type
dict
- datasets()¶
Returns all cluster data sets.
Endpoint: [GET] /datasets
- Returns
Query result as a list of datasets.
- Return type
list
- delete_cluster()¶
Shuts down the cluster.
Endpoint: [DELETE] /cluster
- Returns
Result of delete operation.
- Return type
dict
- delete_dataset(dataset_id)¶
Triggers the deletion of a cluster data set. This async operation would return a DatasetTrigger for further query identifier.
Endpoint: [DELETE] /datasets/:datasetid
- Parameters
dataset_id (str) – 32-character hexadecimal string value that identifies a cluster data set.
- Returns
Object that can be used to query the status of delete operation.
- Return type
- property jars¶
- property jobmanager¶
- property jobs¶
- overview()¶
Returns an overview over the Flink cluster.
Endpoint: [GET] /overview
- Returns
Key-value pairs of flink cluster infos.
- Return type
dict
- property taskmanagers¶
flink_rest_client.v1.jars module¶
- class flink_rest_client.v1.jars.JarsClient(prefix)¶
Bases:
object
- all()¶
Returns a list of all jars previously uploaded via ‘/jars/upload’.
Endpoint: [GET] /jars
- Returns
List all the jars were previously uploaded.
- Return type
dict
- delete(jar_id)¶
Deletes a jar previously uploaded via ‘/jars/upload’.
Endpoint: [DELETE] /jars/:jarid
- Parameters
jar_id (str) – String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars.
- Returns
True, if jar_id has been successfully deleted, otherwise False.
- Return type
bool
- Raises
RestException – If the jar_id does not exist.
- get_plan(jar_id)¶
Returns the dataflow plan of a job contained in a jar previously uploaded via ‘/jars/upload’.
Endpoint: [POST] /jars/:jarid/plan
- Parameters
jar_id (str) – String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars.xe
- Returns
Details of the jar_id’s plan.
- Return type
dict
- Raises
RestException – If the jar_id does not exist.
- run(jar_id, arguments=None, entry_class=None, parallelism=None, savepoint_path=None, allow_non_restored_state=None)¶
Submits a job by running a jar previously uploaded via ‘/jars/upload’.
Endpoint: [POST] /jars/:jarid/run
- Parameters
jar_id (str) – String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars.
arguments (dict) – (Optional) Dict of program arguments.
entry_class (str) – (Optional) String value that specifies the fully qualified name of the entry point class. Overrides the class defined in the jar file manifest.
parallelism (int) – (Optional) Positive integer value that specifies the desired parallelism for the job.
savepoint_path (str) – (Optional) String value that specifies the path of the savepoint to restore the job from.
allow_non_restored_state (bool) – (Optional) Boolean value that specifies whether the job submission should be rejected if the savepoint contains state that cannot be mapped back to the job.
- Returns
32-character hexadecimal string value that identifies a job.
- Return type
str
- Raises
RestException – If the jar_id does not exist.
- upload(path_to_jar)¶
Uploads a jar to the cluster from the input path. The jar’s name will be the original filename from the input path.
Endpoint: [POST] /jars/upload
- Parameters
path_to_jar (str) – Path to the jar file.
- Returns
Result of jar upload.
- Return type
dict
- upload_and_run(path_to_jar, arguments=None, entry_class=None, parallelism=None, savepoint_path=None, allow_non_restored_state=None)¶
Helper method to upload and start a jar in one method call.
- Parameters
path_to_jar (str) – Path to the jar file.
arguments (dict) – (Optional) Comma-separated list of program arguments.
entry_class (str) – (Optional) String value that specifies the fully qualified name of the entry point class. Overrides the class defined in the jar file manifest.
parallelism (int) – (Optional) Positive integer value that specifies the desired parallelism for the job.
savepoint_path (str) – (Optional) String value that specifies the path of the savepoint to restore the job from.
allow_non_restored_state (bool) – (Optional) Boolean value that specifies whether the job submission should be rejected if the savepoint contains state that cannot be mapped back to the job.
- Returns
32-character hexadecimal string value that identifies a job.
- Return type
str
- Raises
RestException – If an error occurred during the upload of jar file.
flink_rest_client.v1.jobmanager module¶
- class flink_rest_client.v1.jobmanager.JobmanagerClient(prefix)¶
Bases:
object
- config()¶
Returns the cluster configuration.
Endpoint: [GET] /jobmanager/config
- Returns
Cluster configuration dictionary.
- Return type
dict
- get_log(log_file)¶
Returns the content of the log_file.
Endpoint: [GET] /jobmanager/logs/:log_file
- Parameters
log_file (str) – Name of the log file.
- Returns
The content of the log file as a string
- Return type
str
- logs()¶
Returns the list of log files on the JobManager.
Endpoint: [GET] /jobmanager/logs
- Returns
List of log files
- Return type
dict
- metric_names()¶
Return the supported metric names.
- Returns
List of metric names.
- Return type
list
- metrics()¶
Provides access to job manager metrics.
Endpoint: [GET] /jobmanager/metrics
- dict
Jobmanager metrics
flink_rest_client.v1.jobs module¶
- class flink_rest_client.v1.jobs.JobTrigger(prefix, type_name, job_id, trigger_id)¶
Bases:
object
- property status¶
- class flink_rest_client.v1.jobs.JobVertexClient(prefix, job_id, vertex_id)¶
Bases:
object
- backpressure()¶
Returns back-pressure information for a job, and may initiate back-pressure sampling if necessary.
Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/backpressure
Notes
The deprecated status means that the back pressure stats are not available.
- Returns
Backpressure information
- Return type
dict
- details()¶
Returns details for a task, with a summary for each of its subtasks.
Endpoint: [GET] /jobs/:jobid/vertices/:vertexid
- Returns
details for a task.
- Return type
dict
- metric_names()¶
Returns the supported metric names.
- Returns
List of metric names.
- Return type
list
- metrics(metric_names=None)¶
Provides access to task metrics.
Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/metrics
- Returns
Task metrics.
- Return type
dict
- property prefix_url¶
- property subtasks¶
- subtasktimes()¶
Returns time-related information for all subtasks of a task.
Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasktimes
- Returns
Time-related information for all subtasks
- Return type
dict
- taskmanagers()¶
Returns task information aggregated by task manager.
Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/taskmanagers
- Returns
Task information aggregated by task manager.
- Return type
dict
- watermarks()¶
Returns the watermarks for all subtasks of a task.
Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/watermarks
- Returns
Watermarks for all subtasks of a task.
- Return type
list
- class flink_rest_client.v1.jobs.JobVertexSubtaskClient(prefix)¶
Bases:
object
- accumulators()¶
Returns all user-defined accumulators for all subtasks of a task.
Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/accumulators
- Returns
User-defined accumulators
- Return type
dict
- get(subtask_id)¶
Returns details of the current or latest execution attempt of a subtask.
Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasks/:subtaskindex
- Parameters
subtask_id (int) – Positive integer value that identifies a subtask.
- Returns
- Return type
dict
- get_attempt(subtask_id, attempt_id=None)¶
Returns details of an execution attempt of a subtask. Multiple execution attempts happen in case of failure/recovery.
Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasks/:subtaskindex/attempts/:attempt
- Parameters
subtask_id (int) – Positive integer value that identifies a subtask.
attempt_id (int) – (Optional) Positive integer value that identifies an execution attempt. Default: current execution attempt’s id
- Returns
Details of the selected attempt.
- Return type
dict
- get_attempt_accumulators(subtask_id, attempt_id=None)¶
Returns the accumulators of an execution attempt of a subtask. Multiple execution attempts happen in case of failure/recovery.
- Parameters
subtask_id (int) – Positive integer value that identifies a subtask.
attempt_id (int) – (Optional) Positive integer value that identifies an execution attempt. Default: current execution attempt’s id
- Returns
The accumulators of the selected execution attempt of a subtask.
- Return type
dict
- metric_names()¶
Returns the supported metric names.
- Returns
List of metric names.
- Return type
list
- metrics(metric_names=None, agg_modes=None, subtask_ids=None)¶
Provides access to aggregated subtask metrics. By default it returns with all existing metric names.
Endpoint: [GET] /jobs/:jobid/vertices/:vertexid/subtasks/metrics
- Parameters
metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>
agg_modes (list) – (optional) List of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”. Default: <all modes>
subtask_ids (list) – List of positive integers to select specific subtasks. The list of valid subtask ids is available through the subtask_ids() method. Default: <all subtasks>.
- Returns
Key-value pairs of metrics.
- Return type
dict
- property prefix_url¶
- subtask_ids()¶
Returns the subtask identifiers.
- Returns
Positive integer list of subtask ids.
- Return type
list
- class flink_rest_client.v1.jobs.JobsClient(prefix)¶
Bases:
object
- all()¶
Returns an overview over all jobs and their current state.
Endpoint: [GET] /jobs
- Returns
List of jobs and their current state.
- Return type
list
- create_savepoint(job_id, target_directory, cancel_job=False)¶
Triggers a savepoint, and optionally cancels the job afterwards. This async operation would return a JobTrigger for further query identifier.
Endpoint: [GET] /jobs/:jobid/savepoints
Notes
The target directory has to be a location accessible by both the JobManager(s) and TaskManager(s) e.g. a location on a distributed file-system or Object Store.
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
target_directory (str) – Savepoint target directory.
cancel_job (bool) – If it is True, it also stops the job after the savepoint creation.
- Returns
Object that can be used to query the status of savepoint.
- Return type
- get(job_id)¶
Returns details of a job.
Endpoint: [GET] /jobs/:jobid
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
- Returns
Details of the selected job.
- Return type
dict
- get_accumulators(job_id, include_serialized_value=None)¶
Returns the accumulators for all tasks of a job, aggregated across the respective subtasks.
Endpoint: [GET] /jobs/:jobid/accumulators
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
include_serialized_value (bool) – (Optional) Boolean value that specifies whether serialized user task accumulators should be included in the response.
- Returns
Accumulators for all task.
- Return type
dict
- get_checkpoint_details(job_id, checkpoint_id, show_subtasks=False)¶
Returns details for a checkpoint.
Endpoint: [GET] /jobs/:jobid/checkpoints/details/:checkpointid
If show_subtasks is true: Endpoint: [GET] /jobs/:jobid/checkpoints/details/:checkpointid/subtasks/:vertexid
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
checkpoint_id (int) – Long value that identifies a checkpoint.
show_subtasks (bool) – If it is True, the details of the subtask are also returned.
- Returns
- Return type
dict
- get_checkpoint_ids(job_id)¶
Returns checkpoint ids of the job_id.
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
- Returns
List of checkpoint ids.
- Return type
list
- get_checkpointing_configuration(job_id)¶
Returns the checkpointing configuration of the selected job_id
Endpoint: [GET] /jobs/:jobid/checkpoints/config
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
- Returns
Checkpointing configuration of the selected job.
- Return type
dict
- get_checkpoints(job_id)¶
Returns checkpointing statistics for a job.
Endpoint: [GET] /jobs/:jobid/checkpoints
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
- Returns
Checkpointing statistics for the selected job: counts, summary, latest and history.
- Return type
dict
- get_config(job_id)¶
Returns the configuration of a job.
Endpoint: [GET] /jobs/:jobid/config
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
- Returns
Job configuration
- Return type
dict
- get_exceptions(job_id)¶
Returns the most recent exceptions that have been handled by Flink for this job.
Endpoint: [GET] /jobs/:jobid/exceptions
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
- Returns
The most recent exceptions.
- Return type
dict
- get_execution_result(job_id)¶
Returns the result of a job execution. Gives access to the execution time of the job and to all accumulators created by this job.
Endpoint: [GET] /jobs/:jobid/execution-result
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
- Returns
The execution result of the selected job.
- Return type
dict
- get_metrics(job_id, metric_names=None)¶
Provides access to job metrics.
Endpoint: [GET] /jobs/:jobid/metrics
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>
- Returns
Job metrics.
- Return type
dict
- get_plan(job_id)¶
Returns the dataflow plan of a job.
Endpoint: [GET] /jobs/:jobid/plan
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
- Returns
Dataflow plan
- Return type
dict
- get_vertex(job_id, vertex_id)¶
Returns a JobVertexClient.
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
vertex_id (str) – 32-character hexadecimal string value that identifies a vertex.
- Returns
JobVertexClient instance that can execute vertex related queries.
- Return type
- get_vertex_ids(job_id)¶
Returns the ids of vertices of the selected job.
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
- Returns
List of identifiers.
- Return type
list
- job_ids()¶
Returns the list of job_ids.
- Returns
List of job ids.
- Return type
list
- metric_names()¶
Returns the supported metric names.
- Returns
List of metric names.
- Return type
list
- metrics(metric_names=None, agg_modes=None, job_ids=None)¶
Returns an overview over all jobs.
Endpoint: [GET] /jobs/metrics
- Parameters
metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>
agg_modes (list) – (optional) List of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”. Default: <all modes>
job_ids (list) – List of 32-character hexadecimal strings to select specific jobs. The list of valid jobs are available through the job_ids() method. Default: <all taskmanagers>.
- Returns
Aggregated job metrics.
- Return type
dict
- overview()¶
Returns an overview over all jobs.
Endpoint: [GET] /jobs/overview
- Returns
List of existing jobs.
- Return type
list
- rescale(job_id, parallelism)¶
Triggers the rescaling of a job. This async operation would return a ‘triggerid’ for further query identifier.
Endpoint: [GET] /jobs/:jobid/rescaling
Notes
Using Flink version 1.12, the method will raise RestHandlerException because this rescaling is temporarily disabled. See FLINK-12312.
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
parallelism (int) – Positive integer value that specifies the desired parallelism.
- Returns
Object that can be used to query the status of rescaling.
- Return type
- stop(job_id, target_directory, drain=False)¶
Stops a job with a savepoint. This async operation would return a JobTrigger for further query identifier.
Attention: The target directory has to be a location accessible by both the JobManager(s) and TaskManager(s) e.g. a location on a distributed file-system or Object Store.
Draining emits the maximum watermark before stopping the job. When the watermark is emitted, all event time timers will fire, allowing you to process events that depend on this timer (e.g. time windows or process functions). This is useful when you want to fully shut down your job without leaving any unhandled events or state.
Endpoint: [GET] /jobs/:jobid/stop
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
target_directory (str) – Savepoint target directory.
drain (bool) – (Optional) If it is True, it emits the maximum watermark before stopping the job. default: False
- Returns
Object that can be used to query the status of savepoint.
- Return type
- terminate(job_id)¶
Terminates a job.
Endpoint: [PATCH] /jobs/:jobid
- Parameters
job_id (str) – 32-character hexadecimal string value that identifies a job.
- Returns
True if the job has been canceled, otherwise False.
- Return type
bool
flink_rest_client.v1.taskmanagers module¶
- class flink_rest_client.v1.taskmanagers.TaskManagersClient(prefix)¶
Bases:
object
- all()¶
Returns an overview over all task managers.
Endpoint: [GET] /taskmanagers
- Returns
List of taskmanagers. Each taskmanager is represented by a dictionary.
- Return type
list
- get(taskmanager_id)¶
Returns details for a task manager.
Endpoint: [GET] /taskmanagers/:taskmanagerid
- Parameters
taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.
- Returns
Query result as a dict.
- Return type
dict
- get_logs(taskmanager_id)¶
Returns the list of log files on a TaskManager.
Endpoint: [GET] /taskmanagers/:taskmanagerid/logs
- Parameters
taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.
- Returns
List of log files in which each element contains a name and size fields.
- Return type
list
- get_metrics(taskmanager_id, metric_names=None)¶
Provides access to task manager metrics.
Endpoint: [GET] /taskmanagers/:taskmanagerid/metrics
- Parameters
taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.
metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>
- Returns
Metric name -> Metric value key-value pairs. The values are provided as strings.
- Return type
dict
- get_thread_dump(taskmanager_id)¶
Returns the thread dump of the requested TaskManager.
Endpoint: [GET] /taskmanagers/:taskmanagerid/thread-dump
- Parameters
taskmanager_id (str) – 32-character hexadecimal string that identifies a task manager.
- Returns
ThreadName -> StringifiedThreadInfo key-value pairs.
- Return type
dict
- metric_names()¶
Return the supported metric names.
- Returns
List of metric names.
- Return type
list
- metrics(metric_names=None, agg_modes=None, taskmanager_ids=None)¶
Provides access to aggregated task manager metrics. By default it returns with all existing metric names.
Endpoint: [GET] /taskmanagers/metrics
- Parameters
metric_names (list) – (optional) List of selected specific metric names. Default: <all metrics>
agg_modes (list) – (optional) List of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”. Default: <all modes>
taskmanager_ids (list) – List of 32-character hexadecimal strings to select specific task managers. The list of valid taskmanager ids are available through the taskmanager_ids() method. Default: <all taskmanagers>.
- Returns
Key-value pairs of metrics.
- Return type
dict
- taskmanager_ids()¶
Returns the list of taskmanager_ids.
- Returns
List of taskmanager ids.
- Return type
list