Daniel Lemire's blog


Don’t assume that safety comes for free: a Swift case study

19 thoughts on “Don’t assume that safety comes for free: a Swift case study”

  1. Exceptions on 64 bit overflows? I guess every crypto code written in Swift needs to use this “unsafe” notation (I wonder if the compiler throws countless warnings about it?). I’m from the old school of x86 assembly, and I have never liked signed integers in HLL languages – or, even worse, the lack of unsigned integers (Java) – because to me and to the hardware they are natural things. This “invention” goes against the hardware CPU design: if it doesn’t trigger an exception in low-level code, why should it crash in high-level code? God damn future…

    1. Exceptions on 64 bit overflows?

      It is a runtime assert. The program crashes.

      I guess every crypto code written in Swift needs to use this “unsafe” notation (I wonder if the compiler throws countless warnings about it?).

      The compiler won’t complain; the check happens at runtime. The ordinary “+” performs a checked addition that traps on overflow, while the ampersand notation (“&+”, “&-”, “&*”) performs unchecked, wrapping arithmetic. If you do want unchecked overflow, you have to opt in with the ampersand notation.
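      A minimal sketch of the difference between the two notations (the variable names are mine, for illustration):

```swift
let big = Int64.max

// Checked addition: "+" traps at runtime on overflow,
// so the next line would crash the program if uncommented.
// let crash = big + 1

// Wrapping addition: "&+" silently wraps around.
let wrapped = big &+ 1
print(wrapped == Int64.min)  // true

// Later Swift versions also offer an explicit checked form
// that reports the overflow instead of trapping.
let (partial, didOverflow) = big.addingReportingOverflow(1)
print(didOverflow)           // true
print(partial == Int64.min)  // true
```

      Note that `addingReportingOverflow` is the Swift 4+ spelling; in the Swift 3 era of this thread the equivalent was the static `Int64.addWithOverflow`.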

      I’m from the old school of x86 assembly, and I have never liked signed integers in HLL languages – or, even worse, the lack of unsigned integers (Java) – because to me and to the hardware they are natural things. This “invention” goes against the hardware CPU design: if it doesn’t trigger an exception in low-level code, why should it crash in high-level code? God damn future…

      You are right that there is a mismatch between hardware and software. See John Regehr’s blog post (link at the bottom).

  2. Ben says:

    In my fantasy world, a decade or two from now it will be mainstream for programs to be written in the more abstract style and for compilers to prove that fancy optimizations are safe (in the cases when they are actually safe, of course). This has been some people’s fantasy for at least a generation or two, but it feels like we’re actually getting closer to having this technology (see, e.g., seL4 and CompCert).

    The difference between “safe” and safe is interesting. I would characterize it as the difference between avoiding certain implementation-level bad behaviors versus ensuring that an application behaves sensibly in terms of high-level behavior. The latter is much harder in general, but I still believe the former has value.

    Your analogy with optical illusions feels a little off to me. Natural neural systems have tons of checks and balances to increase the likelihood of reasonable behavior. If software goes out of its intended execution path (for example via numerical overflow), it is much less likely that the result will be acceptable.

    1. Your analogy with optical illusions feels a little off to me. Natural neural systems have tons of checks and balances to increase the likelihood of reasonable behavior.

      Good engineering is not about preventing faults at a high cost. It is about managing the damage when faults occur.

      Your software *will* fail. There is no getting around that. Thinking that “safe” software is software that does not fail is wrong.

      So what we do when problems occur matters a great deal. Is “falling dead” the right approach?

      1. Ben says:

        > Good engineering is not about preventing faults at a high cost. It is about managing the damage when faults occur.

        I think you’ve stated this too strongly. Good engineering practice involves both avoiding and managing problems. Even some of the “move fast and break things” zealots acknowledge that it’s possible to take that motto too far.

        > So what we do when problems occur matters a great deal. Is “falling dead” the right approach?

        Of course not, at least from a complete system perspective. But languages like C and C++ by default allow suspicious things like numerical overflow and out-of-bounds accesses to go by unnoticed. It’s hard to take any compensatory action when the system doesn’t notice there’s a problem.

        I have very little experience with Swift, and I think immediately killing the process might be too severe. But even that approach can be managed by having another process monitoring the application and restarting it as needed.

        1. I think you’ve stated this too strongly.

          Millions of people die in car accidents. We could prevent nearly all of these deaths if we built cars like tanks, but we don’t all drive military-grade tanks because costs matter. We accept that there will be accidents and failures. If you design things on the assumption that you can make faults go away, either your costs will be out of this world or you will fool yourself, and when a fault does occur it will be terrible because it was not planned for.

          The choice Swift makes is to assume that overflows will happen, and that when they do occur, a crash is harmless enough.

          But languages like C and C++ by default allow suspicious things like numerical overflow and out-of-bounds accesses to go by unnoticed.

          With clang and GNU GCC, you can compile your C and C++ code with runtime checks:
          http://lemire.me/blog/2016/04/20/no-more-leaks-with-sanitize-flags-in-gcc-and-clang/

          I presume that most industrial-strength C and C++ compilers must have similar capabilities. Of course, it comes with a performance penalty.

  3. Yathaid says:

    Well, this is kinda implementation specific. By implementation, I mean the compiler and the runtime provided by the language. There are arguments to be made in favor of Rust where the abstraction still does not pay a cost in runtime: https://ruudvanasseldonk.com/2016/11/30/zero-cost-abstractions

    1. My blog post was about safety (overflow checks). I guess you refer to something else, which is the “cost” of abstraction (functional programming idioms in this case). Then, sure, I was surprised that the reduce approach was slower than the loop approach myself and I don’t think that the reason for the performance difference is all that clear.

      I would like you to consider the following points:

      1. Rust (in release mode) does not check for overflows unlike Swift. So it is not directly comparable with Swift.

      2. There are plenty of cases where Swift, Rust, JavaScript, C++, and Java will have the same performance whether you use loops or functional programming idioms… but this, by itself, does not prove that abstraction has no performance cost. There are definitely cases where functional programming idioms carry a performance cost (as shown here).
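      The two styles under discussion can be sketched as follows; both compute the same sum, and only the compiler’s ability to inline the closure distinguishes their performance (the example is mine):

```swift
let values = Array(1...1_000)

// Procedural loop: fully transparent to the optimizer.
var loopSum = 0
for v in values {
    loopSum &+= v   // wrapping add, as in the benchmarks in the post
}

// Functional idiom: same result, but the closure must be
// inlined by the compiler to match the loop's performance.
let reduceSum = values.reduce(0) { $0 &+ $1 }

print(loopSum == reduceSum)  // true: both are 500500
```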

  4. Matthew Self says:

    An Ariane 5 rocket was destroyed during takeoff because the guidance software (written in Ada) caused an overflow that halted the program. The computation being performed wasn’t even needed for flight, but it was running anyway.

  5. Joe Groff says:

    We looked at your `reduce` example, and in the latest compiler, it appears that the for loop and reduce get essentially equivalent performance now:

    Array_for: 0m1.675s user time

    Array_reduce: 0m1.638s user time

    Thanks for the test case!

    1. Any chance you might look at my post “Resizing arrays can be slow in Swift”?

      1. Joe Groff says:

        We do have a known issue that `isUniquelyReferenced` checks don’t get hoisted out of an `append` loop. That might account for part of the problem there. I’ll file a bug to make sure we investigate.
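        Until such a fix lands, one common workaround for append-heavy loops is to reserve capacity up front so the buffer is not repeatedly reallocated; a sketch (the sizes here are made up):

```swift
var x = Array(repeating: 1, count: 100)
let size = 1_000

// Reserving capacity up front avoids repeated reallocation
// during the appends, though it does not remove the per-append
// uniqueness check mentioned above.
var answer: [Int] = []
answer.reserveCapacity(size)
answer.append(contentsOf: x)
answer.append(contentsOf: repeatElement(0, count: size - x.count))
x = answer
print(x.count)  // 1000
```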

        1. Thanks. My argument is that this should never be the fastest way to do things…

          func extendArray(_ x: inout [Int], size: Int) {
            var answer = Array(repeating: 0, count: size)
            for i in 0..<x.count {
              answer[i] = x[i]
            }
            x = answer
          }
          

          If that’s the fast way, then performance is left on the table.

          1. Joe Groff says:

            By all means, that’s a deficiency in Swift’s optimizer. There is another way to initialize the array, using the initializer from another Sequence:

            x = Array((0..<size).lazy.map { $0 < x.count ? x[$0] : 0 })

            which will avoid the initial cost of bzero-ing the array buffer, but isn't always the clearest way to express the initialization.

            1. I think that this syntax x = Array((0..<size).lazy.map { $0 < x.count ? x[$0] : 0 }) in Swift is very elegant. I’ll remember it.

              Elegance aside, I am not sure it is necessarily very fast given the current state of Swift. Granted, in principle, it could be fast, but it would require lots of work from the compiler.

              This would make for a long exchange, but I am not sure that map/filter are well suited to high performance computing. They can be slower than simple old-fashioned procedural code because the latter is more transparent to the compiler… especially as you chain them.

              Anyhow.

              I find that your proposal is a tad slower than x += repeatElement(0, count: newcount), and that’s not the fastest way.

              They are both easily 3x slower than they should be.

              1. Joe Groff says:

                There’s no fundamental reason the `lazy` variations of `map` and `filter` should be slower than a loop. They’re fully inlinable and don’t produce temporary copies, only “views” over the underlying collection. It sounds like Xcode 8’s compiler had some problems optimizing away closure overhead, hence the perf difference you saw between `reduce` and the for loop, but those issues seem to be fixed in 8.3 beta 1. I just tried this, and consistently get ~17% better results with “B”:

                import Dispatch

                var x = Array(repeating: 1738, count: 1_000_000)
                var sink = x

                print("A:")
                print(dispatch_benchmark(1_000) {
                    var answer = Array(repeating: 0, count: 2_000_000)
                    for i in 0 ..< x.count {
                        answer[i] = x[i]
                    }
                    // Prevent `answer` from being DCE'd
                    sink = answer
                })

                print("B:")
                print(dispatch_benchmark(1_000) {
                    let answer = Array((0..<2_000_000).lazy.map { $0 < x.count ? x[$0] : 0 })
                    // Prevent `answer` from being DCE'd
                    sink = answer
                })

                Perhaps neither is yet as fast as it should be, but we're working on it!

                1. those issues seem to be fixed in 8.3 beta 1

                  Excellent!

                2. By the way, please do not think that I am trying to take down Swift. I am a fan.

                  1. Joe Groff says:

                    No problem at all! I appreciate that you’re finding and raising all these performance issues; it gives us motivation to fix them!