
    YAML formats are not lossless

    Recently I was using YAML files to package up content to send across the wire. Each package included a checksum and a signature of the contents to ensure that they hadn’t been modified and that they came from a trusted source.

    I ran into an issue where some of the content would fail signature verification on the receiving side. A lot of digging later, it turned out that parsing YAML content in Ruby with YAML::load can strip the whitespace from lines that contain nothing else, depending on what type of quotes you use when writing the file. WHAT?

    require 'yaml'
    File.open('strip.yaml','w') do |f|
      f << {:content => "\ncontent\n    \nmore content\n"}.to_yaml
    end
    data = YAML::load IO.read('strip.yaml')
    puts data
    
    # this will produce:
    {:content=>"\ncontent\n\nmore content\n"}
    

    As you can see, the four spaces on the whitespace-only line were stripped out. A signature check on the client fails if the trusted source signed the content while it still contained that whitespace.
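
    To make the consequence concrete, here is a minimal sketch that compares checksums of the string as written and the string as it came back from YAML::load. The `checksum` helper and the choice of SHA-256 are assumptions for illustration; the actual checksum and signature scheme I used isn't shown here.

```ruby
require 'digest'

# The string that was written out, and the string YAML::load returned
# in the example above (whitespace-only line emptied).
SIGNED   = "\ncontent\n    \nmore content\n"
RECEIVED = "\ncontent\n\nmore content\n"

# Hypothetical helper: a SHA-256 hex digest stands in for whatever
# checksum the real packaging code computes.
def checksum(text)
  Digest::SHA256.hexdigest(text)
end

puts checksum(SIGNED) == checksum(RECEIVED)  # false
puts SIGNED.length - RECEIVED.length         # 4
```

    Four missing spaces are enough to change the digest, so any signature computed over the original content will not verify against the loaded content.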

    However, after a little tinkering, and with the help of vi’s syntax highlighting, I found out that using single quotes around the content we write to the file gives completely different behavior:

    require 'yaml'
    File.open('strip.yaml','w') do |f|
      # notice how the content string uses single quotes now
      f << {:content => '\ncontent\n    \nmore content\n'}.to_yaml
    end
    data = YAML::load IO.read('strip.yaml')
    puts data
    
    # this will produce:
    {:content=>"\\ncontent\\n    \\nmore content\\n"}
    

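    Notice here how the spaces are retained, but we no longer have true newline characters. So there are really two things at play. One, the type of quotes you use makes a difference (even if the content isn’t using #{} interpolation): in Ruby, single-quoted strings don’t process escape sequences such as \n, so this string holds a literal backslash followed by an n and sits on a single line, with no whitespace-only line to strip. Two, YAML::load doesn’t always respect your data, especially lines with nothing but whitespace.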
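
    The quoting difference is plain Ruby behavior and can be checked without YAML at all. A small sketch, with constant names made up for illustration:

```ruby
# Double quotes process escape sequences: "\n" is one newline character.
DOUBLE_QUOTED = "\ncontent\n    \nmore content\n"
# Single quotes do not: '\n' is two characters, a backslash and an "n".
SINGLE_QUOTED = '\ncontent\n    \nmore content\n'

puts DOUBLE_QUOTED.length          # 27
puts SINGLE_QUOTED.length          # 31
puts DOUBLE_QUOTED.lines.length    # 4
puts SINGLE_QUOTED.lines.length    # 1
puts SINGLE_QUOTED.include?("\n")  # false
```

    The single-quoted version is one long line, so there is no whitespace-only line for the YAML round trip to touch.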

About

My name is Scott H. Conner. I've been writing software for over a decade, and I've dug myself into plenty of holes. You can learn more about me here.
