Project

General

Profile

Actions

Backport #5233

closed

OpenSSL::SSL::SSLSocket has problems with encodings other than "ascii"

Added by niklasb (Niklas Baumstark) over 12 years ago. Updated about 12 years ago.


Description

The attached script shows the issue. It expects a combined cert/private key in the file server.pem under the current directory (also attached).
Under Ruby 1.9.2p290, the script prints "str.size is 50848 (expecting 100000)".
As a workaround the string encoding can be forced to "ascii" before the write.


Files

openssl_write_failure.rb (892 Bytes) openssl_write_failure.rb A sample script showing the problem niklasb (Niklas Baumstark), 08/26/2011 08:45 PM
server.pem (1.62 KB) server.pem A self-signed certificate file for executing the script niklasb (Niklas Baumstark), 08/26/2011 08:45 PM

Updated by MartinBosslet (Martin Bosslet) over 12 years ago

  • Category set to ext
  • Status changed from Open to Assigned
  • Assignee set to MartinBosslet (Martin Bosslet)
  • Target version set to 1.9.4

Updated by MartinBosslet (Martin Bosslet) over 12 years ago

The problem is in lib/openssl/buffering.rb:

def do_write(s)
  @wbuffer = "" unless defined? @wbuffer
  @wbuffer << s
  @sync ||= false
  if @sync or @wbuffer.size > BLOCK_SIZE or idx = @wbuffer.rindex($/)
    remain = idx ? idx + $/.size : @wbuffer.length
    nwritten = 0
    while remain > 0
      str = @wbuffer[nwritten,remain]
      begin
        nwrote = syswrite(str)
      rescue Errno::EAGAIN
        retry
      end
      remain -= nwrote
      nwritten += nwrote
    end
    @wbuffer[0,nwritten] = ""
  end
end

remain gets initialized with @wbuffer.length, the string length in characters,
but nwrote receives the actual number of bytes written, so less bytes than
actually available are written.

A fix for this would be treating @wbuffer strictly as binary data by forcing
its encoding to BINARY. I'm not sure, does anyone see a more elegant way or
would this solution suffice?

Updated by normalperson (Eric Wong) over 12 years ago

Martin Bosslet wrote:

The problem is in lib/openssl/buffering.rb:

def do_write(s)
  @wbuffer = "" unless defined? @wbuffer
  @wbuffer << s
  @sync ||= false
  if @sync or @wbuffer.size > BLOCK_SIZE or idx = @wbuffer.rindex($/)
    remain = idx ? idx + $/.size : @wbuffer.length
    nwritten = 0
    while remain > 0
      str = @wbuffer[nwritten,remain]
      begin
        nwrote = syswrite(str)
      rescue Errno::EAGAIN
        retry
      end
      remain -= nwrote
      nwritten += nwrote
    end
    @wbuffer[0,nwritten] = ""
  end
end

remain gets initialized with @wbuffer.length, the string length in characters,
but nwrote receives the actual number of bytes written, so less bytes than
actually available are written.

A fix for this would be treating @wbuffer strictly as binary data by forcing
its encoding to BINARY. I'm not sure, does anyone see a more elegant way or
would this solution suffice?

I use an "-- encoding: binary --" comment at the top of all Ruby
source files where I initialize string literals for storing binary data.
It's cleaner than setting Encoding::BINARY on every string I create
(and nearly all my code works exclusively on binary data).

Also, all of the Ruby (non-SSL) *Socket objects have Encoding::BINARY by
default anyways, so I think SSLSocket should be the same.

Updated by MartinBosslet (Martin Bosslet) over 12 years ago

I use an "-- encoding: binary --" comment at the top of all Ruby
source files where I initialize string literals for storing binary data.
It's cleaner than setting Encoding::BINARY on every string I create
(and nearly all my code works exclusively on binary data).

I'm afraid this had no effect, or I did it wrong, or I might also have
misunderstood you. The incoming string s already has UTF-8 encoding, so

@wbuffer << s

ends up as UTF-8 regardless of the encoding I set for the .rb file, I
figured this was because "<<" calls rb_str_append which again calls
rb_enc_check which will determine a compatible encoding, in this case
UTF-8, for @wbuffer. But again, I might have misunderstood you.

Also, all of the Ruby (non-SSL) *Socket objects have Encoding::BINARY by
default anyways, so I think SSLSocket should be the same.

I'm sorry, I don't understand what you mean by the *Socket objects have
binary encoding by default - do you mean it's binary data they are expecting
to deal with for input and output? So a user would have to make sure to only
pass already BINARY-encoded strings to any *Socket?

I quickly checked with a TCPServer and Net::HTTP client, there the aforementioned
situation would work, when sending 100000 a-Umlauts you again receive the same
amount, after enforcing the response to UTF-8 again, of course. That's why I
thought that an SSLSocket should behave the same way.

Updated by normalperson (Eric Wong) over 12 years ago

Martin Bosslet wrote:

I use an "-- encoding: binary --" comment at the top of all Ruby
source files where I initialize string literals for storing binary data.
It's cleaner than setting Encoding::BINARY on every string I create
(and nearly all my code works exclusively on binary data).

I'm afraid this had no effect, or I did it wrong, or I might also have
misunderstood you. The incoming string s already has UTF-8 encoding, so

@wbuffer << s

ends up as UTF-8 regardless of the encoding I set for the .rb file, I
figured this was because "<<" calls rb_str_append which again calls
rb_enc_check which will determine a compatible encoding, in this case
UTF-8, for @wbuffer. But again, I might have misunderstood you.

You're right. rb_str_append() modifies the empty @wbuffer to the
encoding of "s" above :(

I suppose calling @wbuffer.force_encoding(Encoding::BINARY) after
@wbuffer is necessary (unless you write the buffering code in C like
io.c does).

Also, all of the Ruby (non-SSL) *Socket objects have Encoding::BINARY by
default anyways, so I think SSLSocket should be the same.

I'm sorry, I don't understand what you mean by the *Socket objects have
binary encoding by default - do you mean it's binary data they are expecting
to deal with for input and output? So a user would have to make sure to only
pass already BINARY-encoded strings to any *Socket?

For all newly-created *Socket objects, external_encoding is already
ASCII-8BIT (binary) and the sockets should just pass the byte buffer of
any underlying String objects given to it.

I quickly checked with a TCPServer and Net::HTTP client, there the aforementioned
situation would work, when sending 100000 a-Umlauts you again receive the same
amount, after enforcing the response to UTF-8 again, of course. That's why I
thought that an SSLSocket should behave the same way.

Yes, underlying IO#read/read_nonblock/sysread for the TCPSocket objects
should return new ASCII-8BIT Strings. You needed to force them to UTF-8
yourself upon receipt.

Actions #6

Updated by Anonymous over 12 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r33485.
Niklas, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • lib/openssl/buffering.rb: Force multi-byte strings to be treated as
    binary data.
  • test/openssl/test_ssl.rb: Add test for it.

Thanks to Niklas Baumstark for reporting the issue!

[Ruby 1.9 - Bug #5233] [ruby-core:39120]

Updated by MartinBosslet (Martin Bosslet) over 12 years ago

Also thanks to Eric, for providing his thoughts on the topic!

Actions #8

Updated by naruse (Yui NARUSE) about 12 years ago

  • Tracker changed from Bug to Backport
  • Project changed from Ruby master to Backport193
  • Category deleted (ext)
  • Target version deleted (1.9.4)

Updated by naruse (Yui NARUSE) about 12 years ago

  • Status changed from Closed to Assigned

This breaks tests on CentOS 5.6 (but not Ubuntu 10.04, FreeBSD 8, 9).
http://c5664.rubyci.org/~chkbuild/ruby-1.9.3/log/20120210T173209Z.diff.html.gz

Maybe more some commits are needed.

Updated by naruse (Yui NARUSE) about 12 years ago

  • Status changed from Assigned to Closed

Backported r33508 and fixed this.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0