Administration Guide for Cisco Media Experience Engine 3500 Release 3.3
Job Monitoring and Management

Table Of Contents

Job Monitoring and Management

Job Status

Job Status Overview

Monitoring Jobs

Monitoring Tasks

Viewing Errors

Viewing Output Clip

Viewing Directory/Watch Status

Showing Job XML

Rescheduling Jobs

Stopping Jobs

Deleting Jobs

Resetting Job Priority

Filtering Jobs

Timed Job Status

Timed Job Status Overview

Working with Jobs in Timed Job Monitor

Cancelling Future Timed Jobs

Pausing and Removing Timed Jobs

System Status

System Status Overview

Working with the System Status Monitor

Health Status

Health Status Overview

Color

Health Counter

Working with the Health Status Monitor


Job Monitoring and Management


This section includes the following topics:

Job Status

Timed Job Status

System Status

Health Status

Job Status

This section includes the following topics:

Job Status Overview

Monitoring Jobs

Monitoring Tasks

Viewing Errors

Viewing Output Clip

Viewing Directory/Watch Status

Showing Job XML

Rescheduling Jobs

Stopping Jobs

Deleting Jobs

Resetting Job Priority

Filtering Jobs

Job Status Overview

View job status and perform tasks related to job status from within the Job Status Monitor. It displays all jobs that have not been reaped (deleted by the system).

To access the Job Status Monitor:

From the Toolbox, select Monitoring > Job Status

or

From the main menu, select View > Monitoring > Job Status

See Figure 15-1.

Figure 15-1 Job Status Monitor Upper Pane

The Job Status Monitor upper pane displays the jobs that are currently pending, running, complete, or failed. Jobs are color coded based on their status. See also: Monitoring Jobs.

The jobs displayed may disappear as you are viewing them because the system automatically removes (reaps) jobs based on the Auto Reap Interval. The Auto Reap Interval specifies how long job information displays on the Job Status Monitor before it is cleared from the Monitor. When the system reaps jobs, it removes data that has been processed and completed. The Auto Reap Interval begins from the time a job completes (or when it fails).
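The timing rule above can be sketched in a few lines of Python. This is only an illustrative model, not Cisco MXE 3500 code: the function name `is_reapable`, the 24-hour interval, and the choice to treat a job as reapable at exactly the interval boundary are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_reapable(finished_at: Optional[datetime],
                now: datetime,
                auto_reap_interval: timedelta) -> bool:
    """Return True once the Auto Reap Interval has elapsed since the job finished.

    finished_at is the time the job completed or failed. None means the job
    is still pending or running, so its interval has not started yet.
    """
    if finished_at is None:
        return False
    return now - finished_at >= auto_reap_interval


# Hypothetical values: a 24-hour interval and a job that finished at 09:00.
interval = timedelta(hours=24)
finished = datetime(2024, 1, 1, 9, 0, 0)

print(is_reapable(finished, datetime(2024, 1, 1, 18, 0, 0), interval))  # False
print(is_reapable(finished, datetime(2024, 1, 2, 9, 0, 0), interval))   # True
print(is_reapable(None, datetime(2024, 1, 2, 9, 0, 0), interval))       # False
```

The point of the sketch is that the clock starts when the job completes or fails, not when it is submitted, so a long-running job is not reaped while it is still in progress.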

The upper pane of the Job Status Monitor provides job information as described in Table 15-1.

Table 15-1 Job Status Fields  

Field Name
Description

Job ID

Displays the job ID number as generated by the host.

Job Profile Name

Displays the name of the job profile that was defined when the watch was set up.

Title

Displays the job title that was defined when the watch was set up.

Author

Displays the author of the job that was defined when the watch was set up.

Submit Time

Displays the time when the job was automatically submitted for processing. This column can be sorted by last submitted job or by first submitted job.

Priority

Displays the job priority that was defined when the watch was set up. Priority can be 1 - 100, with 1 being the highest priority.

Status

Displays the status of the job as it is being processed.

Values are:

Pending: The job is currently in the queue and has not started.

Running: The job is currently running.

Completed: The job has successfully completed.

Failed: The job failed or the user manually stopped the job.

Stopped: User stopped the job.



Note Click any of the headings (Job ID, Job Profile Name, etc.) at the top of the Job Status Monitor to sort the open jobs by the selected field. By default, jobs are sorted from most recently submitted to earliest submitted.


If not all jobs are displayed, use the scroll bars to view the remaining jobs.

Job Options

On the Job Status Monitor page, click the arrow to the right of the Job Options button to display the available options. Most of the options are self-explanatory. Note the difference between two of them: Reschedule resubmits the entire job from scratch, whereas Retry resubmits only failed or dependent tasks. Retry is particularly useful when, for example, encoding has completed but distribution fails. See Figure 15-2.
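The difference between Reschedule and Retry can be illustrated with a small Python model. It is a sketch under stated assumptions, not the product's scheduler: the `Task` class and the `tasks_to_resubmit` function are hypothetical names, and the status strings are taken from the Task Status values used by the Job Status Monitor.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    task_id: int
    task_type: str
    status: str  # a Task Status value, for example "Succeeded" or "Failed"


def tasks_to_resubmit(tasks: List[Task], option: str) -> List[int]:
    """Return the IDs of the tasks that a job option would run again."""
    if option == "Reschedule":
        # Reschedule resubmits the entire job from scratch.
        return [t.task_id for t in tasks]
    if option == "Retry":
        # Retry resubmits failed or dependent tasks only.
        return [t.task_id for t in tasks
                if t.status in ("Failed", "Dependent Task")]
    raise ValueError(f"unsupported option: {option}")


# Hypothetical job: encoding succeeded, distribution failed, and one
# further task is still waiting on the failed task.
job_tasks = [
    Task(1, "Preprocessor", "Succeeded"),
    Task(2, "Flash encoder", "Succeeded"),
    Task(3, "File Manager", "Failed"),
    Task(4, "File Manager", "Dependent Task"),
]

print(tasks_to_resubmit(job_tasks, "Reschedule"))  # [1, 2, 3, 4]
print(tasks_to_resubmit(job_tasks, "Retry"))       # [3, 4]
```

In this model, Retry leaves the two completed encoding-side tasks alone, which is why it saves time when only distribution has failed.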

Figure 15-2 Job Options

See also: Monitoring Tasks.

Monitoring Jobs

Monitor the status of all jobs submitted in the Cisco MXE 3500 system from the Job Status Monitor page.

To access the page, from the Toolbox, expand Monitoring, and click Job Status.

Each job contains multiple tasks. To view the tasks associated with a job and their status, double-click the job row in the upper pane (shown here in blue). The tasks display in the lower pane on the Tasks tab. See Figure 15-3.

Figure 15-3 Job Status Monitor Page

This page shows several jobs that are in progress or that have recently been completed. Jobs are color-coded based on the status described in Table 15-2.

Table 15-2 Job Status Color Coding  

Status
Color
Description

Pending

Yellow

The job has been submitted, but work has not yet begun.

Running

Green

The job has been submitted and work has begun. The job stays in Running status until all tasks in the Job Profile have been executed or until the job is determined to have failed.

Completed

Blue

All the tasks in the job profile have completed successfully.

Failed

Red

One or more tasks in the Job Profile could not be completed successfully. For example, if communication with an FTP service cannot be established, the job will fail because the distribution task cannot be completed successfully. Similarly, if you stop a job, it will fail with the following error message: user stop request.

If a job fails, select the Errors tab for a summary of errors that have occurred. (To obtain additional details on why jobs failed, contact your Cisco MXE 3500 administrator.) Take the necessary actions to correct any jobs that have failed, and resubmit or reschedule the job.

You may also view the XML code for a selected job for more detail on how it is being processed. See also: Showing Job XML.

Stopped

Orange

User stopped the job.
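For readers who post-process a list of job statuses (for example, one copied from the monitor into a script), the color coding in Table 15-2 amounts to a simple lookup. The following Python sketch is illustrative only. The names `STATUS_COLORS`, `color_for`, and `count_by_color` are hypothetical, and the snapshot data is invented.

```python
from collections import Counter
from typing import Dict, Iterable

# Row colors as listed in Table 15-2.
STATUS_COLORS = {
    "Pending": "Yellow",
    "Running": "Green",
    "Completed": "Blue",
    "Failed": "Red",
    "Stopped": "Orange",
}


def color_for(status: str) -> str:
    """Return the row color for a job status, or raise for an unknown status."""
    try:
        return STATUS_COLORS[status]
    except KeyError:
        raise ValueError(f"unknown job status: {status}") from None


def count_by_color(statuses: Iterable[str]) -> Dict[str, int]:
    """Count how many jobs would be shown in each color."""
    return dict(Counter(color_for(s) for s in statuses))


# Hypothetical snapshot of the Status column.
snapshot = ["Running", "Pending", "Failed", "Completed", "Completed"]
print(count_by_color(snapshot))
```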



Tip If not all of the jobs are displayed, use the scroll bars to view the remaining jobs.


Monitoring Tasks

The lower pane of the Job Status Monitor displays job tasks or job errors, depending on which tab you select. Double-click a job in the upper pane to display its tasks or errors in the lower pane. See Figure 15-4.

Figure 15-4 Job Status Monitor Tasks

Each task within the job, and its status, are listed. Task fields are described in Table 15-3.

Table 15-3 Task Fields  

Field Name
Description

Task ID

Displays a unique numerical ID the Cisco MXE 3500 assigns to each task within the job.

Task Type

The task type represents the specific type of task that is executed by a given worker (for example, Preprocessor, Flash encoder, or File Manager) on a specific node. The tasks are defined by the Job Profile selected for the job. See also: Job Profiles.

Begin Time

Displays the time when the task was started.

Complete Time

Displays the time when the task was completed.

% Complete

Displays the percentage of the task that is currently complete.

Note For Speech to Text tasks, the numbers displayed here represent time elapsed.

Task Status

Displays the current status of the task.

Possible values are:

Dependent Task: Execution is dependent on one or more other task status events (start or complete events).

Pending: The task is waiting to be scheduled.

Provisioned Task: The task has acquired license(s) and a node worker. It is now waiting for notification from the LCS that the task has started.

Running: The task is being executed by a worker on an LCS node.

StopRequest: The task is being halted by the scheduler or operator. A terminate request was sent to the LCS, and the system is waiting for confirmation from the LCS that the task is complete.

Succeeded: The task has completed successfully.

Failed: The task failed on the LCS or was invalid.

UserStopped: The task was stopped at the request of the operator. It will not be rescheduled.

ConditionNotMet: The task cannot be run because a start condition will never be met.

Preempted: The system or operator preempted the task execution. The task should be rescheduled.
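The Task Status values listed above can be grouped by whether a task is still waiting, currently active, or finished. The Python sketch below is one possible grouping inferred from the descriptions. The `TaskStatus` enum, the three sets, and the `phase` function are hypothetical and are not part of the Cisco MXE 3500 software.

```python
from enum import Enum


class TaskStatus(Enum):
    DEPENDENT = "Dependent Task"
    PENDING = "Pending"
    PROVISIONED = "Provisioned Task"
    RUNNING = "Running"
    STOP_REQUEST = "StopRequest"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    USER_STOPPED = "UserStopped"
    CONDITION_NOT_MET = "ConditionNotMet"
    PREEMPTED = "Preempted"


# Assumption: the grouping below is inferred from the status descriptions.
# Preempted is treated as waiting because the task should be rescheduled.
WAITING = {TaskStatus.DEPENDENT, TaskStatus.PENDING,
           TaskStatus.PROVISIONED, TaskStatus.PREEMPTED}
# A StopRequest task is still active until the LCS confirms it is complete.
ACTIVE = {TaskStatus.RUNNING, TaskStatus.STOP_REQUEST}
# These tasks will not run again unless the job is retried or rescheduled.
FINISHED = {TaskStatus.SUCCEEDED, TaskStatus.FAILED,
            TaskStatus.USER_STOPPED, TaskStatus.CONDITION_NOT_MET}


def phase(status: TaskStatus) -> str:
    """Classify a task status as 'waiting', 'active', or 'finished'."""
    if status in WAITING:
        return "waiting"
    if status in ACTIVE:
        return "active"
    return "finished"


print(phase(TaskStatus("Provisioned Task")))  # waiting
print(phase(TaskStatus("StopRequest")))       # active
print(phase(TaskStatus("UserStopped")))       # finished
```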


Viewing Errors

Click the Errors tab to view task error information as described in Table 15-4.

Table 15-4 Error Fields  

Field Name
Description

Task ID

Displays the ID number of the task that was running when the error occurred.

Task Type

Describes the type of task that was being performed when the error occurred.

Failure Message

Describes the error. Typically, these are warning or error level messages returned from a given worker executing a task.


Error Types and Possible Solutions

There are many types of errors that might display, including the following:

Network errors or permission issues: Try rescheduling the job to see if the network errors clear, and/or recheck permissions. (To obtain additional details on network and permission issues, contact your Cisco MXE 3500 administrator.)

Errors related to Folder Attendant not running: View the Folder Attendant Log to determine a possible cause.

Errors related to the system not running: Contact your Cisco MXE 3500 administrator.

Errors related to jobs failing: Check to see that job profiles are set correctly and that valid media is chosen for that profile, and resubmit the job.

See also: Troubleshooting Cisco MXE 3500.

Viewing Output Clip

To view the output clip, from the Tasks menu, right-click a task, and click View Output Clip. See Figure 15-5.

Figure 15-5 Viewing Output Clip from Job Monitor Tasks Menu


Note You may only view clips from the same domain on which the clip resides.


Viewing Directory/Watch Status

The Folder Attendant Administration page shows the directories and watches that have been defined. See Figure 15-6.

Figure 15-6 Configured Directories and Watches

If a directory has been defined, but a watch has not been defined for the directory, the Profile, State and # Files fields are blank for the directory. If a watch has been defined for the directory, those fields are populated.

Table 15-5 shows the fields that are displayed.

Table 15-5 Folder Attendant Administration Page Fields  

Field Name
Description

Directory

Displays the name of the directory currently being monitored. This information is entered when you add a new directory.

Profile

Displays the profile of the watch, as defined in your Cisco MXE 3500 system, that applies to the managed directory. A watch is a unique combination of the Directory and Profile. This information is entered when you add a new watch. If this field is blank, a watch has not been set up for this directory.

Priority

Displays the job priority of the watch. This information is entered when you add a new watch. If this field is blank, a watch has not been set up for this directory.

State

Displays the availability of the monitored directory.

Possible values are:

Online: Directory is currently being monitored.

Offline: Directory cannot be accessed by Folder Attendant for monitoring (probably because of an error).

Disabled: User has disabled the directory so it cannot be monitored.

# Files

Displays the number of files (media or XML) submitted in the monitored directory. The information is filled in automatically from the Cisco MXE 3500. If this field is blank, a watch has not been set up for this directory.
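The relationship between directories and watches in Table 15-5 can be modeled as follows. This Python sketch is illustrative only. The `DirectoryRow` class, its attribute names, and the example paths and profile name are assumptions made for the example. It captures two rules from the table: a watch is identified by its Directory and Profile together, and a directory without a watch has blank (here, `None`) Profile, State, and # Files fields.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DirectoryRow:
    directory: str
    profile: Optional[str] = None     # None when no watch is defined
    priority: Optional[int] = None
    state: Optional[str] = None       # "Online", "Offline", or "Disabled"
    file_count: Optional[int] = None

    @property
    def has_watch(self) -> bool:
        return self.profile is not None

    @property
    def watch_key(self) -> Optional[Tuple[str, str]]:
        """A watch is identified by its (directory, profile) pair."""
        return (self.directory, self.profile) if self.has_watch else None


def offline_watches(rows: List[DirectoryRow]) -> List[Tuple[str, str]]:
    """Return the watches whose directory Folder Attendant cannot access."""
    return [r.watch_key for r in rows if r.has_watch and r.state == "Offline"]


# Hypothetical rows; the paths and the profile name are invented.
rows = [
    DirectoryRow("/media/in", "All Streaming", 50, "Online", 3),
    DirectoryRow("/media/archive", "All Streaming", 50, "Offline", 0),
    DirectoryRow("/media/unused"),  # directory defined, no watch yet
]

print(offline_watches(rows))
```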


You can also filter the directories that are displayed in this page to view only those directories of interest.

Showing Job XML

Job XML provides detailed instructions used by the Cisco MXE 3500 system to execute a job. If you encounter any job submission problems, the Cisco MXE 3500 Technical Support Team may request XML code (and log files) to assist them in troubleshooting the issues.

Procedure


Step 1 Access the Job Status Monitor page.

Step 2 Select the job, and click Show Job XML. See Figure 15-7.

Figure 15-7 Show Job XML

Step 3 The XML code displays on a new page.

Step 4 If not all of the XML code is displayed in the page, use the scroll arrows on the right side of the page to view the rest of the code.

Step 5 When you are done viewing the XML, select the X in the top right corner to return to the Job Status Monitor page.


Rescheduling Jobs

Rescheduling a job will re-queue it. If the job is currently running, all of its tasks are stopped, and then the job is rescheduled. If you reschedule a job that has failed, it will attempt to run again, as soon as it is able. When you reschedule jobs, you do not have the option of specifying an exact time when they will run.

If a network problem prevented the job from running, you can reschedule the job after the network problem clears to attempt to process it successfully. However, if the job failed because of a problem with the profile, examine the Errors tab on the Job Status Monitor page and the LCS log file, make the necessary changes, and then resubmit the job.

Procedure


Step 1 Access the Job Status Monitor page.

Step 2 Select the job(s), and from the Job Options drop-down, click Reschedule. See Figure 15-8.

Figure 15-8 Select Job to be Rescheduled

A message displays at the top of the page indicating that the job has been successfully rescheduled. See Figure 15-9.

Figure 15-9 Successful Reschedule Message

Step 3 Double-click the job to monitor its progress.


Stopping Jobs

You may choose to stop a job for a number of reasons: You may have chosen the wrong profile, or the job may be taking too long to process and you want to stop it to free up resources for other more critical jobs.

If you stop a job, the status of the job will change to Stopped.

Procedure


Step 1 To stop a job, access the Job Status Monitor page.

Step 2 Select the job(s), and from the Job Options drop-down (or right-click menu), click Stop. See Figure 15-10. A stop confirmation message displays.

Figure 15-10 Select the Job(s) to Stop

Step 3 Select OK to stop the selected job(s). A message displays at the top of the Job Status Monitor page indicating the ID number of the job that was stopped. The Status field updates with the current status (failed).

Step 4 Select the Errors tab to view the Failure Message. See Figure 15-11.

Figure 15-11 Selected Jobs Have Been Stopped


Deleting Jobs

When you delete a job, it no longer appears in the status monitor and cannot be stopped, rescheduled, or viewed. Any job (in any state) can be deleted.

Procedure


Step 1 Access the Job Status Monitor page.

Step 2 Select the job, and from the Job Options drop-down, select Delete. See Figure 15-12. A delete confirmation message displays.

Figure 15-12 Select the Job to be Deleted

Step 3 Select OK to delete the selected job(s). A message displays indicating which job has been deleted. The deleted job is removed from the job list. See Figure 15-13.

Figure 15-13 Selected Job Has Been Deleted


Resetting Job Priority

Increase or decrease the priority of a job to change the order in which jobs are processed if multiple jobs are pending. Job priority can be set from 1-100 with 1 as highest priority and 100 as lowest priority.

Jobs with higher priority (a lower priority number) will be processed before jobs with lower priority.


Note Job Priority is a goal for the Cisco MXE 3500 system. Due to resource availability and the job profile selected, a lower priority job may still be scheduled before a higher priority job. There are also special cases where certain higher priority jobs can preempt a lower priority job (as in the case with live jobs) if there are no resources available.

You can only set (or reset) job priority if you have a Resource Manager license.
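The nominal ordering described in this section can be sketched in Python. As the note says, priority is a goal rather than a guarantee, so this models only the ideal order. The `PendingJob` class and both functions are hypothetical names, and breaking ties by earliest submit time is an assumption that the guide does not state.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class PendingJob:
    job_id: int
    priority: int           # 1 (highest) to 100 (lowest)
    submit_time: datetime


def validate_priority(value: int) -> int:
    """Reject priorities outside the documented 1-100 range."""
    if not 1 <= value <= 100:
        raise ValueError("priority must be between 1 and 100")
    return value


def nominal_order(jobs: List[PendingJob]) -> List[int]:
    """Order job IDs by priority number, lowest number first.

    Assumption: jobs with equal priority run in submit-time order.
    """
    ranked = sorted(jobs, key=lambda j: (j.priority, j.submit_time))
    return [j.job_id for j in ranked]


# Hypothetical pending jobs.
pending = [
    PendingJob(101, 50, datetime(2024, 1, 1, 9, 0)),
    PendingJob(102, 1, datetime(2024, 1, 1, 9, 5)),
    PendingJob(103, 50, datetime(2024, 1, 1, 8, 55)),
]

print(nominal_order(pending))  # [102, 103, 101]
```

Job 102 was submitted last but is listed first because its priority number is the lowest. Resource availability, the job profile, and preemption by live jobs can still change the actual order.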


Procedure


Step 1 Access the Job Status Monitor page.

Step 2 Select the job(s), and from the Job Options drop-down menu, select Reset Job Priority. See Figure 15-14.

Figure 15-14 Select the Jobs for which Priority Will be Reset

Step 3 A Reset Job Priority pop-up displays. Enter the new number (1-100), and click Set Priority. The following message displays, and the Priority field is updated. See Figure 15-15.

Figure 15-15 Priority for the Selected Job Reset


Filtering Jobs

The Filter button on the Job Status Monitor page allows you to display a subset of all the jobs. Filter jobs using any of the following parameters (or any combination of these parameters):

Job ID

Job Profile Name

Title

Author

Submit Time

Priority

Status


Note Even if jobs are filtered, they are still being processed as usual. This function only limits the number of jobs displayed on the page.


Procedure


Step 1 Access the Job Status Monitor page. See Figure 15-16.

Figure 15-16 Jobs Before Filters Have Been Applied

Step 2 Select the Filter button from the menu bar. The Job Status Filter pop-up displays. See Figure 15-17.

Figure 15-17 Job Status Filter Pop-Up

Step 3 Complete one or more fields to specify how to filter the job status display. For example, if you enter All Streaming in the Job Profile field, that means that only the jobs that have the All Streaming profile are displayed. The filtering fields are described in the following table:

Table 15-6 Job Status Filter Fields  

Field Name
Description

Job ID

Enter the unique numerical Job ID for the job to be displayed.

Job Profile Name

Enter the name of the job profile for the job(s) to be displayed.

Title

Enter the title of the job to be displayed.

Priority

Enter a numerical priority (between 1 and 100). If the priority for the selected job matches this priority, the job will be displayed.

Author

Enter the author of the job(s) to be displayed.

Status

Select the status of the job(s) to be displayed from the drop-down menu.

Submit Time

Select a start date and time, an end date and time, or both by checking the appropriate boxes. Enter the start and end dates using the calendar selection box to the right of the date fields. Enter the start and end times in hh:mm:ss format.


Step 4 When you have completed the desired fields, click Set Filter. The Job Status Monitor page is updated and displays only jobs matching the filter fields.
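The way filter fields combine can be sketched in Python. This is an illustrative model, not the product's implementation. The `Job` and `JobFilter` classes are hypothetical, and two behaviors are assumptions: text fields are matched exactly, and the Submit Time range includes its end points. A field left empty (`None`) places no restriction on the result, and all completed fields must match.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Job:
    job_id: int
    profile: str
    title: str
    author: str
    submit_time: datetime
    priority: int
    status: str


@dataclass
class JobFilter:
    job_id: Optional[int] = None
    profile: Optional[str] = None
    title: Optional[str] = None
    author: Optional[str] = None
    priority: Optional[int] = None
    status: Optional[str] = None
    submitted_from: Optional[datetime] = None
    submitted_to: Optional[datetime] = None

    def matches(self, job: Job) -> bool:
        """Return True if the job satisfies every field that was filled in."""
        exact = [
            (self.job_id, job.job_id),
            (self.profile, job.profile),
            (self.title, job.title),
            (self.author, job.author),
            (self.priority, job.priority),
            (self.status, job.status),
        ]
        if any(want is not None and want != have for want, have in exact):
            return False
        if self.submitted_from is not None and job.submit_time < self.submitted_from:
            return False
        if self.submitted_to is not None and job.submit_time > self.submitted_to:
            return False
        return True


def apply_filter(jobs: List[Job], job_filter: JobFilter) -> List[Job]:
    """Return only the jobs to display; filtered-out jobs are left untouched."""
    return [j for j in jobs if job_filter.matches(j)]


# Hypothetical jobs; "Archive" is an invented profile name.
jobs = [
    Job(1, "All Streaming", "Keynote", "alice", datetime(2024, 1, 1, 9, 0), 50, "Running"),
    Job(2, "Archive", "Keynote", "bob", datetime(2024, 1, 1, 10, 0), 50, "Failed"),
    Job(3, "All Streaming", "Demo", "alice", datetime(2024, 1, 2, 9, 0), 10, "Failed"),
]

shown = apply_filter(jobs, JobFilter(profile="All Streaming", status="Failed"))
print([j.job_id for j in shown])  # [3]
```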


Timed Job Status

This section includes the following topics:

Timed Job Status Overview

Working with Jobs in Timed Job Monitor

Cancelling Future Timed Jobs

Pausing and Removing Timed Jobs

Timed Job Status Overview

The Timed Job Status page is used to display summary information on timed jobs that are essentially on hold until their designated Start Time. Timed jobs are created by checking the Enable Timed Submission box on the Job Submission page. See Figure 15-18.

Figure 15-18 Timed Job Status Monitor

Table 15-7 Timed Job Status Monitor Headings and Descriptions 

Heading
Description

Job ID

Displays the job ID number as generated by the host.

Title

Displays the job title that was defined when the watch was set up.

Author

Displays the user who submitted the job or the user-supplied information added in the Author metadata field on the Job Submission page.

Start Time

Indicates the date and time that the job is scheduled to begin. These values are set in the Start Date and Start Time fields of the Job Submission page.

Priority

Displays the number that corresponds to the priority assigned on the Job Submission page. Priority can be between 1 and 100, with 1 having the highest priority.

Last Added

Displays the last time a recurring job was added. Recurring jobs are submitted when the first instance is processed, and again with each new instance. The Last Added date will reflect the date and time that the last instance of the job was submitted.

Period

Displays the Repeat Interval for the job in seconds. The Repeat Interval is defined using the Repeat Every or the Repeat Interval field on the Job Submission page.

Status

Displays the status of the job as it is being processed.

Possible values are:

Active: Identifies jobs that are set to execute at the time assigned as the Start Time. Active Jobs are identified by a blue background.

Inactive: Identifies jobs that have been paused by a user. Inactive jobs are identified by a yellow background.

Completed: This one time only job has finished.
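Because the Period column reports the Repeat Interval in seconds, large values can be hard to read at a glance. The following Python helper is a convenience sketch for converting such a value into days, hours, minutes, and seconds. The function name `describe_period` and its output format are assumptions and are not part of the product.

```python
def describe_period(period_seconds: int) -> str:
    """Render a Period value (in seconds) as a compact duration string."""
    if period_seconds < 0:
        raise ValueError("period cannot be negative")
    days, remainder = divmod(period_seconds, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    parts = [(days, "d"), (hours, "h"), (minutes, "m"), (seconds, "s")]
    # Keep only the non-zero units so that 3600 reads as "1h".
    text = " ".join(f"{amount}{unit}" for amount, unit in parts if amount)
    return text or "0s"


print(describe_period(3600))    # 1h
print(describe_period(86400))   # 1d
print(describe_period(90061))   # 1d 1h 1m 1s
```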


Working with Jobs in Timed Job Monitor

Figure 15-19 shows available Timed Job Monitor options. Table 15-8 describes the options.

Figure 15-19 Timed Job Monitor Job Options

Table 15-8 Timed Job Options and Descriptions

Job Option
Description

Delete

Deletes the job from the Timed Job Monitor. This ends the cycle of submission for recurring timed jobs.

Note Recurring jobs that are no longer needed should be deleted. Leaving unnecessary recurring jobs in the Timed Status view means that the jobs will continue to be submitted. This will result in either unnecessary jobs being processed, or failed jobs because all of the requirements for the timed job are no longer being met.

Pause

Temporarily prevents the job from processing, even if the Start Time arrives. Pausing a job changes the status of the job from Active to Inactive.

Resume

Changes an Inactive (paused) job to Active.


Cancelling Future Timed Jobs

Procedure


Step 1 In the Timed Job Monitor, right-click a job.

Step 2 Click Delete Job.

Step 3 When the delete confirmation pop-up displays, click OK.


Pausing and Removing Timed Jobs

Procedure


Step 1 To pause a timed job:

a. In the Timed Job Monitor, right-click a job.

b. Click Pause Job. The job moves to the top of the list, and displays a status of Inactive.

Step 2 To resume a timed job:

a. In the Timed Job Monitor, right-click a job.

b. Click Resume Job. The job moves back to its original position, and displays a status of Active.


System Status

This section includes the following topics:

System Status Overview

Working with the System Status Monitor

System Status Overview

View information about system components currently involved in processing jobs with the System Status Monitor. This page displays one line of information for each host in the system. Each line contains bars that represent an encoder or other worker.

To access the System Status Monitor:

From the Toolbox, select Monitoring > System Status

or

From the main menu, select View > Monitoring > System Status

See Figure 15-20.

Figure 15-20 System Status Monitor

The name of the host is displayed in the first column, followed by bars which represent the tasks currently running on the host.

The colored bars for each task indicate the type of worker that is running, the Job ID, and the percentage of the task that is complete.

For example, the two colored bars below indicate:

A Microsoft encoder running Job ID #28 is 2% complete.

A prefilter running Job ID #28 is 0% complete.

If the status area extends beyond the visible area, use the horizontal scroll bar at the bottom of the page to view all tasks for the host.

The status area shows only tasks that are currently running. Once tasks are complete, they no longer display. Similarly, encoders and other workers for which you do not hold a license will not run and therefore will not appear on the System Status Monitor.


Note The Max Cap value that appears on the right side of the pane displays the maximum number of tasks that can run on one node at one time. Capacity is set on the Host Administration page. See also: Host Administration.
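A host row on this page can be pictured as a list of task bars plus a Max Cap value. The Python sketch below is an illustrative model only. The `TaskBar` and `HostRow` classes, the host name, and the Max Cap of 4 are hypothetical, and the sketch assumes that each running task counts as one unit against Max Cap.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskBar:
    worker: str
    job_id: int
    percent_complete: int

    def label(self) -> str:
        """Describe the bar in the same terms as the example in this section."""
        return (f"{self.worker} running Job ID #{self.job_id} "
                f"is {self.percent_complete}% complete")


@dataclass
class HostRow:
    host: str
    max_cap: int                       # maximum concurrent tasks on this node
    bars: List[TaskBar] = field(default_factory=list)

    def free_capacity(self) -> int:
        """Number of additional tasks the host could run right now."""
        return max(self.max_cap - len(self.bars), 0)


# Hypothetical host running the two tasks from the example above.
row = HostRow("mxe-host-1", max_cap=4, bars=[
    TaskBar("Microsoft encoder", 28, 2),
    TaskBar("Prefilter", 28, 0),
])

for bar in row.bars:
    print(bar.label())
print(row.free_capacity())  # 2
```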


Working with the System Status Monitor

The System Status Monitor allows you to interact with running tasks. Figure 15-21 shows the options. Table 15-9 describes the options.

Figure 15-21 System Status Monitor Right-click Options

Table 15-9 System Status Options and Descriptions  

Option
Description

Set Offline

Sets the worker offline, making it temporarily inaccessible to the ECS. A currently running task will be completed before the worker is made unavailable.

Preempt

Takes the selected task away from the host so that the next available task can start immediately. The preempted task is maintained in the queue to be run as soon as a resource is available.

Preempting a task is not the same as stopping a job. Preempting a task changes the order in which tasks will be performed; it does not put the preempted task on hold in any other way.

Preempt and Set Offline

Preempts the selected task and sets the worker offline. The task is interrupted and reassigned to another host and the worker is taken offline immediately.


Health Status

This section includes the following topics:

Health Status Overview

Working with the Health Status Monitor

Health Status Overview

Each host configured to function as part of the Cisco MXE 3500 is assigned tasks depending on the workers configured for that Host. The Health Status Monitor allows you to track the performance of these workers over time. See Figure 15-22.

Figure 15-22 Health Status Monitor

Each row in the Health Status Monitor reflects workers run on a particular host. The Host is listed in the column at the far left, and each block in the row shows statistics on an individual worker. Information about each worker is displayed in the worker blocks in two ways:

Color

Health Counter

Color

The color of the worker block indicates the general performance history, or health, of the worker on that particular host. Table 15-10 describes the colors.

Table 15-10 Health Status Colors and Descriptions

Color
Description

Green

Indicates a worker in good health. Workers that always complete tasks successfully will be displayed in green.

Yellow

Indicates a warning. Workers that complete the majority of tasks successfully, but do report failure on some tasks will be displayed in yellow. This indicates to the administrator that the worker is generally successful, but may need to be monitored if the number of failed tasks increases.

Red

Indicates a worker that requires attention. Workers that fail to complete tasks successfully more often than not will be displayed in red. This indicates to the administrator that the worker is not performing as expected and requires attention.

Brown

Indicates a worker that has been paused or set offline. Offline workers are unavailable to accept work assignments from the ECS. An offline worker displays the word Paused for the duration of the time that it remains offline.

Note Only users who have been granted Administration privileges in the User Administration page are able to set workers offline.


Health Counter

The values shown in the health counter reveal more detailed information about the performance of the worker. Where color gives a general reading of the health of the worker, the health counter reflects the exact number of times that the worker has failed to complete a task, compared to the total number of times the worker has been run. The first number indicates the number of failures. The second number indicates the total number of times the worker has run since the last time the ECS was restarted. See Figure 15-23.

Figure 15-23 Health Counter
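The relationship between the health counter and the block color can be sketched in Python. The guide describes the colors only qualitatively, so the exact thresholds here are assumptions: a worker with no failures (including one that has never run) is green, a worker that fails on more than half of its runs is red, and anything in between is yellow. The function name `health_color` is hypothetical.

```python
def health_color(failures: int, total_runs: int, offline: bool = False) -> str:
    """Map a health counter (failures, total runs) to a block color."""
    if failures < 0 or total_runs < 0 or failures > total_runs:
        raise ValueError("counter must satisfy 0 <= failures <= total_runs")
    if offline:
        # Paused or offline workers are shown in brown regardless of history.
        return "Brown"
    if failures == 0:
        return "Green"
    # "More often than not" is read here as strictly more than half.
    if failures * 2 > total_runs:
        return "Red"
    return "Yellow"


print(health_color(0, 10))                # Green
print(health_color(2, 10))                # Yellow
print(health_color(6, 10))                # Red
print(health_color(2, 10, offline=True))  # Brown
```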

Working with the Health Status Monitor

The Options menu in the Health Status Monitor allows you to interact with workers. This menu can also be accessed by right-clicking any worker block in the list. See Figure 15-24. Table 15-11 describes the options.

Figure 15-24 Health Status Monitor Options

Table 15-11 Health Status Monitor Options and Descriptions  

Option
Description

Set Online

Sets an offline worker back online to resume work. The worker will return to an active state in which it is available to accept tasks assigned by the ECS. This will have no effect on a worker that is not offline.

Set Offline

Sets a worker offline, preventing it from receiving new task assignments. This can be used to bypass a worker experiencing a high rate of failure or to test and verify configuration changes. For testing, the administrator makes changes to a worker on a specific host and then sets all other instances of that worker offline. This forces the ECS to direct jobs to the desired host to verify the configuration change.

Note Setting a worker offline is typically used as a temporary measure during system tuning or troubleshooting until the administrator is able to isolate potential causes of failure.

Preempt

Stops all tasks of the type reflected in the selected Health Status block. For example, if the block reporting the health of the Flash 8 encoder is preempted, all Flash 8 encoding tasks currently running will be preempted. Preempted tasks will remain in the queue and will be run when a resource is available.

Preempt and Set Offline

Stops tasks of the type reflected in the selected Health Status block and sets the worker offline so that it is unavailable to accept new tasks. Preempted tasks will remain in the queue and will be run when a resource is available.

Reset Counter

Temporarily resets the health counter ratio for the selected worker back to zero. This allows the administrator to watch new jobs as they are submitted to determine the rate of failure. This is useful mostly for troubleshooting a specific worker that is experiencing a high rate of failure on a particular host. The health counter will reflect the failure rate for total jobs since the ECS was rebooted once the administrator navigates away from the Status page.

Reset All Counters

Temporarily resets the health counter ratio for all workers back to zero. This allows the administrator to watch all new jobs as they are submitted to monitor the current performance of workers. This is useful during troubleshooting when current statistics are more useful than performance over time. The health counter will again reflect the failure rate for total jobs since the ECS was rebooted once the administrator navigates away from the Status page.

Note The total failure rate since the ECS was restarted is stored in the database. The reset option on this menu allows tracking of processed jobs while the page is open, independent of the recorded statistics in the database. Opening the page in a new window allows the administrator to switch back and forth between other sections of the interface and the Status page.
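One way to picture the temporary reset is as a page-local baseline that is subtracted from the totals stored in the database. The Python sketch below follows that idea. It is an assumed model, not the actual implementation. The `CounterView` class and its method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CounterView:
    """Page-local view of one worker's health counter (hypothetical model)."""
    baseline_failures: int = 0
    baseline_runs: int = 0

    def reset(self, stored_failures: int, stored_runs: int) -> None:
        """Reset Counter: remember the stored totals as the new zero point."""
        self.baseline_failures = stored_failures
        self.baseline_runs = stored_runs

    def displayed(self, stored_failures: int, stored_runs: int) -> Tuple[int, int]:
        """Counter shown on the page: stored totals minus the baseline."""
        return (stored_failures - self.baseline_failures,
                stored_runs - self.baseline_runs)

    def leave_page(self) -> None:
        """Navigating away discards the baseline; stored totals are unchanged."""
        self.baseline_failures = 0
        self.baseline_runs = 0


view = CounterView()
print(view.displayed(3, 40))   # (3, 40) totals since the ECS was restarted
view.reset(3, 40)
print(view.displayed(3, 40))   # (0, 0) immediately after the reset
print(view.displayed(4, 45))   # (1, 5) one failure in five new runs
view.leave_page()
print(view.displayed(4, 45))   # (4, 45) full totals again
```

The stored totals are never modified in this model, which matches the note that the reset is independent of the statistics recorded in the database.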