Multimedia User Interfaces

Discussion in 'Windows Media Player' started by adamsobieski, Aug 31, 2011.

  1. adamsobieski

    adamsobieski Guest


    Greetings. I would like to describe some new multimedia user interface
    features including spatiotemporal content selection, bookmarking,
    spatiotemporal zooming, structure-based navigation, and client-side
    text-based search into multimedia objects. This post briefly
    introduces those ideas and describes an illustrative usage scenario,
    video blogging.

    1. Spatiotemporal Content Selection

    Spatiotemporal content selection means spatial multimedia selection,
    temporal multimedia selection, or both. Spatial multimedia selection
    means, herein, indicating rectangular regions on a video rendering
    surface. Temporal multimedia selection means selecting temporal
    intervals of multimedia, perhaps on the timeline along which the
    playhead moves. Users can gesture with keyboard, mouse, multitouch,
    voice or NUI to indicate spatial regions and temporal intervals of
    multimedia objects. Spatial, temporal and spatiotemporal regions of
    multimedia content can be identified by Media Fragments URI.
    Extensible context menus are envisioned for spatial, temporal and
    spatiotemporal selections.

    2. Bookmarking

    With regard to bookmarking, users can gesture to place a bookmark, or
    point of interest, at a point in a video. It is envisioned that
    bookmarking gestures result in bookmark objects being placed at the
    position of the playhead as the user gestures. After watching more
    or all of a multimedia object, users can return to their indicated
    points of interest, or bookmarks, to, for example, select temporal
    intervals around each such point. Extensible context menus are
    envisioned for bookmark points.

    3. Spatiotemporal Zooming

    With spatiotemporal zooming, users can zoom out from a media fragment
    to a containing spatial region and/or temporal interval, or to the
    multimedia object that contains the region and/or interval.
    Spatiotemporal zooming can make use of tracks that accompany a video,
    for example zooming from a search result fragment to a chapter of a
    multimedia object that contains the search result media fragment.

    4. Structure-Based Navigation

    Beyond sequences of chapters, outlines or tree-based structures are
    possible for multimedia objects. With such tracks, one user interface
    idea is that buttons for chapter traversal can have menus indicating
    the traversal options simultaneously available from the current
    playhead position. Spatiotemporal zooming can combine with
    structure-based navigation to allow users to zoom from a media
    fragment to the structural elements of the multimedia object that
    contain it. For example, a structural model could include books,
    parts, chapters, pages, paragraphs and sentences, and, from a media
    fragment, a user could zoom to a containing structural element, and
    then also navigate by means of those structural elements, based upon
    the particular structural model specified in a track.

    5. Client-side Text-based Search

    By making use of tracks that accompany a multimedia object or of
    client-side audio/video indexing and search, client-side text-based
    search into documents can include the option of searching into
    multimedia objects. Search results can be indicated by highlighted
    portions on the timeline or otherwise visually indicated, perhaps as
    per bookmarks. The finding of text string occurrences in documents can
    extend into multimedia objects contained in those documents.

    6. Usage Scenario: Video Blogging

    Video blogging is an illustrative usage scenario for the above
    multimedia user interface features. A video blogger uses a multimedia
    search engine. Video fragments are indicated
    in the search results. The user watches a search result media fragment
    and decides that they are interested in seeing its entire video blog
    article. The user makes use of zooming to navigate to a containing
    section of or to the entire video blog article. As the user watches
    the other video blogger's video, they make use of bookmarking to place
    points of interest for later use. After watching the video blog, the
    user makes temporal selections around those bookmarked points, while
    perhaps making use of the structural data in one or more tracks of the
    video. The user then makes use of extensible context menus and uses
    the selected clips in video authoring software to compose a video
    blog article with clips from one or more multimedia objects. By
    making use of media fragment URI hyperlinks, users can additionally
    tweet about spatiotemporal selections of multimedia content.

    Other usage scenarios for the new multimedia user interface features
    include making use of video from political speeches, news, punditry,
    arts and entertainment, civil discourse, and arbitrary multimedia
    content, for example when tweeting, blogging or video blogging.

    Kind regards,

    Adam Sobieski
    adamsobieski, Aug 31, 2011

  2. adamsobieski

    adamsobieski Guest

    Regarding those multimedia user interface ideas, here are some
    examples to clarify.

    Regarding point one, a spatial selection selects a rectangle of a
    video. By itself, a spatial selection is a subrectangle for the entire
    duration of the multimedia object: video.avi#xywh=160,120,320,240

    A temporal selection selects, perhaps by means of the timeline, a
    portion or interval of a multimedia object. By itself, a temporal
    selection covers the movie's entire rectangle: video.avi#t=10,20

    Combining those, selecting a rectangle of the multimedia object and an
    interval of it simultaneously is a spatiotemporal selection:
    video.avi#xywh=160,120,320,240&t=10,20
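
    To make the three kinds of selection concrete, here is a minimal
    Python sketch that parses and rebuilds such URIs. The MediaFragment
    class and parse_media_fragment function are hypothetical names used
    only for illustration, and the sketch assumes just the simple
    "xywh=x,y,w,h" and "t=start,end" forms shown in this thread, not the
    full Media Fragments syntax.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import parse_qsl


@dataclass(frozen=True)
class MediaFragment:
    """A spatial, temporal, or spatiotemporal selection (illustrative only)."""
    base: str                                          # URI without the fragment
    xywh: Optional[Tuple[int, int, int, int]] = None   # x, y, width, height (pixels)
    t: Optional[Tuple[float, float]] = None            # start, end (seconds)

    def to_uri(self) -> str:
        """Rebuild the URI, emitting only the dimensions that are set."""
        parts = []
        if self.xywh is not None:
            parts.append("xywh=" + ",".join(str(v) for v in self.xywh))
        if self.t is not None:
            parts.append("t=" + ",".join(f"{v:g}" for v in self.t))
        return self.base + ("#" + "&".join(parts) if parts else "")


def parse_media_fragment(uri: str) -> MediaFragment:
    """Parse only the simple 'xywh=x,y,w,h' and 't=start,end' forms."""
    base, _, fragment = uri.partition("#")
    xywh = None
    t = None
    for name, value in parse_qsl(fragment):
        if name == "xywh":
            x, y, w, h = (int(v) for v in value.split(","))
            xywh = (x, y, w, h)
        elif name == "t":
            start, end = (float(v) for v in value.split(","))
            t = (start, end)
    return MediaFragment(base, xywh, t)
```

    For example, parse_media_fragment("video.avi#xywh=160,120,320,240&t=10,20")
    yields a selection with both a rectangle and an interval, and
    dropping either field before calling to_uri() gives the purely
    temporal or purely spatial URI.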

    Point two, or bookmarking, is about placing points of interest on a
    multimedia object's timeline, for later use, without having to pause
    the multimedia user experience.
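
    As a rough illustration of the bookmarking idea, the following
    Python sketch records playhead positions and later suggests a
    temporal selection around each one. BookmarkList, its methods, and
    the five-second default padding are assumptions made for the
    example, not part of any existing player.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BookmarkList:
    """Points of interest on one media object's timeline (hypothetical helper)."""
    duration: float                                   # media length in seconds
    points: List[float] = field(default_factory=list)

    def add(self, playhead: float) -> None:
        """Record the current playhead position; playback is not interrupted."""
        self.points.append(min(max(playhead, 0.0), self.duration))
        self.points.sort()

    def interval_around(self, point: float, before: float = 5.0,
                        after: float = 5.0) -> Tuple[float, float]:
        """Suggest a temporal selection around a bookmark, clamped to the media."""
        return (max(point - before, 0.0), min(point + after, self.duration))
```

    A player could call add() from whatever gesture it binds to
    bookmarking, then offer interval_around() as a starting selection
    that the user adjusts on the timeline.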

    Point three, observing the URIs for spatial, temporal and
    spatiotemporal media fragments, is about starting from one of those,
    as per <video src="video.avi#xywh=160,120,320,240&t=10,20"/>, and
    being able to navigate to larger rectangles, wider intervals, both,
    or to the video.avi object. Zooming also includes moving from a media
    fragment, such as video.avi#t=10,20, to a containing structural
    element, for example a chapter.
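
    Temporal zooming can be sketched in Python as two small operations:
    widening an interval around its midpoint, and jumping to the chapter
    that contains it. The function names, the (start, end) tuples in
    seconds, and the flat chapter list are assumptions for illustration;
    a real player would read chapters from a track.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def widen(interval: Interval, factor: float, duration: float) -> Interval:
    """Zoom out around the interval's midpoint, clamped to [0, duration]."""
    start, end = interval
    mid = (start + end) / 2
    half = (end - start) * factor / 2
    return (max(mid - half, 0.0), min(mid + half, duration))


def containing_chapter(interval: Interval,
                       chapters: List[Interval]) -> Optional[Interval]:
    """Return the first chapter that fully contains the interval, if any."""
    start, end = interval
    for ch_start, ch_end in chapters:
        if ch_start <= start and end <= ch_end:
            return (ch_start, ch_end)
    return None
```

    With a 60-second video, widen((10.0, 20.0), 2.0, 60.0) gives
    (5.0, 25.0), and a large enough factor reaches the whole object.
    Spatial zooming would follow the same pattern on rectangles.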

    Point four: videos also have tracks, and such tracks can carry
    structures beyond lists of chapters, such as books, parts, chapters,
    pages, paragraphs and sentences. It is possible to select a
    structural element of a video.

    With such structural tracks, people can traverse multimedia objects in
    structure-based ways, for example from a point in a multimedia object
    to a containing or neighboring structural element.
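
    One way to picture a structural track is as a tree of timed
    elements. The Python sketch below is only an assumption about how
    such a track might be modeled in memory: Element, containing_path
    and next_options are hypothetical names, and intervals are treated
    as half-open, [start, end).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """One node of a structural track, e.g. a chapter or a paragraph."""
    kind: str
    start: float                                   # seconds
    end: float                                     # seconds
    children: List["Element"] = field(default_factory=list)


def containing_path(root: Element, time: float) -> List[Element]:
    """Elements containing `time`, outermost first."""
    path = []
    node = root
    while node is not None and node.start <= time < node.end:
        path.append(node)
        node = next((c for c in node.children if c.start <= time < c.end), None)
    return path


def next_options(root: Element, time: float) -> List[Element]:
    """For each level below the root, the sibling following the current element."""
    options = []
    path = containing_path(root, time)
    for parent, current in zip(path, path[1:]):
        index = parent.children.index(current)
        if index + 1 < len(parent.children):
            options.append(parent.children[index + 1])
    return options
```

    containing_path() supports zooming from a playhead position to any
    enclosing element, while next_options() lists what a "next" button
    menu could offer at once, such as the next chapter and the next
    paragraph.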

    The fifth idea is about client-side text searching into videos. Many
    document viewers and web browsers provide searching into documents for
    text occurrences, and that functionality could be extended into the
    multimedia objects in those documents. Client-side
    text-based multimedia search can be facilitated by processing the
    tracks that accompany multimedia objects, such as transcripts or
    captions, and by audio and natural language processing techniques.
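
    For the caption-based case, a minimal Python sketch might look like
    the following. It assumes the captions have already been loaded as
    (start, end, text) tuples, which is a simplification chosen for the
    example; search_cues is a hypothetical name, and phrases that span
    two cues are not found by this simple version.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start seconds, end seconds, caption text)


def search_cues(cues: List[Cue], query: str) -> List[Tuple[float, float]]:
    """Return the intervals of cues whose text contains the query.

    Matching is a case-insensitive substring test on each cue separately.
    """
    needle = query.casefold()
    return [(start, end) for start, end, text in cues
            if needle in text.casefold()]
```

    The returned intervals could be highlighted on the timeline or
    turned into temporal media fragments for navigation.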

    Kind regards,

    Adam Sobieski
    adamsobieski, Sep 2, 2011

  3. adamsobieski

    adamsobieski Guest

    For convenience, after some discussion, the following summary has
    emerged:

    1. Selecting rectangles of video and intervals on video timelines.
    Selecting crop regions and timespans. Those selections can have
    context menus on them.

    2. Bookmarking. Placing points on the timeline of multimedia objects
    while watching them to then make later use of those bookmarked points.
    After indicating bookmark points, selections can be then made around
    or near those bookmark points.

    3. Selections of multimedia objects, media fragments, or clips, can be
    described by Media Fragments URI. A spatial selection, as per a
    rectangle (video.avi#xywh=160,120,320,240), a temporal selection, as
    per an interval of the timeline (video.avi#t=10,20), and a
    combination or spatiotemporal selection
    (video.avi#xywh=160,120,320,240&t=10,20) can each be identified by
    URI, as can entire multimedia objects. Navigating from those to
    larger rectangles or wider intervals is spatiotemporal zooming.

    4. Videos may, in the future, contain more structure than lists of
    chapters. User interface ideas include being able to navigate through
    videos that have more structure than just chapters. For example, a
    video might include a track that describes books, parts, chapters,
    pages, paragraphs and sentences.

    5. By making use of tracks that accompany a multimedia object or of
    client-side audio/video indexing and search, client-side text-based
    search into multimedia is possible. Users can find text occurrences in
    videos and navigate to them.

    Kind regards,

    Adam Sobieski
    adamsobieski, Sep 3, 2011
