    From ScienceDaily@1337:3/111 to All on Wed Sep 9 21:30:40 2020
    Tool transforms world landmark photos into 4D experiences

    Date:
    September 9, 2020
    Source:
    Cornell University
    Summary:
    Using publicly available tourist photos of world landmarks such as
    the Trevi Fountain in Rome or Top of the Rock in New York City,
    researchers have developed a method to create maneuverable 3D
    images that show changes in appearance over time.



    FULL STORY
    ==========================================================================
    Using publicly available tourist photos of world landmarks such as the
    Trevi Fountain in Rome or Top of the Rock in New York City, Cornell
    University researchers have developed a method to create maneuverable
    3D images that show changes in appearance over time.


    The method, which employs deep learning to ingest and synthesize tens
    of thousands of mostly untagged and undated photos, solves a problem
    that has eluded experts in computer vision for six decades.

    "It's a new way of modeling scenes that not only allows you to move
    your head and see, say, the fountain from different viewpoints, but also
    gives you controls for changing the time," said Noah Snavely, associate
    professor of computer science at Cornell Tech and senior author of
    "Crowdsampling the Plenoptic Function," presented at the European
    Conference on Computer Vision, held virtually Aug. 23-28.

    "If you really went to the Trevi Fountain on your vacation, the way it
    would look would depend on what time you went -- at night, it would
    be lit up by floodlights from the bottom. In the afternoon, it would
    be sunlit, unless you went on a cloudy day," Snavely said. "We learned
    the whole range of appearances, based on time of day and weather, from
    these unorganized photo collections, such that you can explore the whole
    range and simultaneously move around the scene."

    Representing a place in a photorealistic way is challenging for
    traditional computer vision, partly because of the sheer number of
    textures to be reproduced. "The real world is so diverse in its
    appearance and has different kinds of materials -- shiny things, water,
    thin structures," Snavely said.

    Another problem is the inconsistency of the available data. Describing
    how something looks from every possible viewpoint in space and time --
    known as the plenoptic function -- would be a manageable task with
    hundreds of webcams affixed around a scene, recording data day and
    night. But since this isn't practical, the researchers had to develop
    a way to compensate.



    "There may not be a photo taken at 4 p.m. from this exact viewpoint in
    the data set. So we have to learn from a photo taken at 9 p.m. at one
    location, and a photo taken at 4:03 from another location," Snavely
    said. "And we don't know the granularity of when these photos were
    taken. But using deep learning allows us to infer what the scene would
    have looked like at any given time and place."

    The researchers introduced a new scene representation called Deep
    Multiplane Images to interpolate appearance in four dimensions -- 3D,
    plus changes over time. Their method is inspired in part by a classic
    animation technique developed by the Walt Disney Company in the 1930s,
    which uses layers of transparencies to create a 3D effect without
    redrawing every aspect of a scene.
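
    The layered-transparency idea can be illustrated with a short sketch.
    This is not the researchers' Deep Multiplane Image method, only generic
    back-to-front "over" blending of semi-transparent layers for a single
    pixel. The function name composite_over and all sample values are
    hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: generic back-to-front "over" compositing of
# semi-transparent layers. Names and values are hypothetical, not taken
# from the paper or from any library.

def composite_over(layers):
    """Blend (r, g, b, alpha) layers, ordered back to front, into one RGB pixel."""
    r = g = b = 0.0
    for lr, lg, lb, alpha in layers:
        # Each nearer layer covers what lies behind it in proportion to its alpha.
        r = lr * alpha + r * (1.0 - alpha)
        g = lg * alpha + g * (1.0 - alpha)
        b = lb * alpha + b * (1.0 - alpha)
    return (r, g, b)


# One pixel seen through three planes at different depths (back to front).
pixel_layers = [
    (0.2, 0.4, 0.9, 1.0),  # opaque sky
    (0.8, 0.8, 0.7, 0.5),  # half-transparent stone edge
    (1.0, 1.0, 1.0, 0.0),  # fully transparent foreground plane
]
pixel = composite_over(pixel_layers)
print(pixel)
```

    Shifting each layer by a different amount before blending is what
    produces the parallax, or 3D, effect as the viewpoint moves.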

    "We use the same idea invented for creating 3D effects in 2D animation
    to create 3D effects in real-world scenes, to create this deep multilayer
    image by fitting it to all these disparate measurements from the
    tourists' photos," Snavely said. "It's interesting that it kind of stems
    from this very old, classic technique used in animation."

    In the study, they showed that this model could be trained to create a
    scene using around 50,000 publicly available images found on sites such
    as Flickr and Instagram. The method has implications for computer vision
    research, as well as virtual tourism -- particularly useful at a time
    when few can travel in person.

    "You can get the sense of really being there," Snavely said. "It works
    surprisingly well for a range of scenes."

    First author of the paper is Cornell Tech doctoral student Zhengqi
    Li. Abe Davis, assistant professor of computer science in the Faculty
    of Computing and Information Science, and Cornell Tech doctoral student
    Wenqi Xian also contributed.

    The research was partly supported by philanthropist Eric Schmidt, former
    CEO of Google, and Wendy Schmidt, by recommendation of the Schmidt
    Futures Program.


    ==========================================================================
    Story Source: Materials provided by Cornell University. Original written
    by Melanie Lefkowitz. Note: Content may be edited for style and length.


    ==========================================================================


    Link to news story: https://www.sciencedaily.com/releases/2020/09/200909100228.htm

    --- up 2 weeks, 2 days, 6 hours, 50 minutes
    * Origin: -=> Castle Rock BBS <=- Now Husky HPT Powered! (1337:3/111)