Thursday, September 7, 2017

Portable way to read a file in C++ and handle possible errors

I want to do a simple thing: read the first line of a file and report errors properly when there is no such file, no permission to read the file, and so on.

I considered the following options:

  • std::ifstream. Unfortunately, there is no portable way to report system errors. Some other answers suggest checking errno after a read fails, but the standard does not guarantee that errno is set by any function in the iostreams library.
  • C-style fopen/fread/fclose. This works, but is not as convenient as iostreams with std::getline. I'm looking for a C++ solution.
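
For reference, the second option could look roughly like the sketch below. The helper name first_line_stdio and its signature are made up for illustration, and the sketch assumes that fopen and fgets set errno on failure, which POSIX requires but ISO C on its own does not.

```cpp
#include <cerrno>
#include <cstdio>
#include <string>
#include <system_error>

// Hypothetical helper (not from the question): reads the first line of the
// file at `path` with the C stdio API. On failure it returns an empty string
// and stores the cause in `ec`.
// Assumption: fopen/fgets set errno on failure (true on POSIX systems).
std::string first_line_stdio(const std::string& path, std::error_code& ec)
{
    ec.clear();
    errno = 0;
    std::FILE* f = std::fopen(path.c_str(), "r");
    if (!f) {
        // errno values are POSIX codes, so they belong to generic_category
        ec.assign(errno, std::generic_category());
        return {};
    }

    std::string line;
    char chunk[256];
    // fgets stops after a newline or when the chunk is full, so keep reading
    // until the newline shows up or the input ends.
    while (std::fgets(chunk, sizeof chunk, f)) {
        line += chunk;
        if (line.back() == '\n') {
            line.pop_back();  // drop the terminator, as std::getline would
            break;
        }
    }
    if (std::ferror(f)) {
        ec.assign(errno, std::generic_category());
        line.clear();
    }
    std::fclose(f);
    return line;
}
```

A caller would pass a std::error_code, check it after the call, and print ec.message() for the diagnostic. The manual chunk loop is the inconvenience mentioned above compared to std::getline.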

Is there any way to accomplish this using C++14 and Boost?

4 Answers

Answers 1

Disclaimer: I am the author of AFIO. That said, exactly what you are looking for is https://ned14.github.io/afio/, the v2 library incorporating the feedback from its Boost peer review in August 2015. Some relevant notes:

  • Portable to any conforming C++ 14 compiler with a working Filesystem TS in its STL (will make use of any Concepts TS if you have them too, and Coroutines TS support is in the works). If you can, you should enable C++ 17 for even better performance.
  • Provides view adapters into the Ranges TS, so ready for STL2.
  • Works on any recent Windows, Linux, OS X or FreeBSD.
  • Original error code is always preserved, even down to the original NT kernel error code if a NT kernel API was used.
  • Race free filesystem design used throughout (i.e. no TOCTOU).
  • Zero malloc, zero exception throw and zero whole system memory copy design used throughout, even down to paths (which can hit 64Kb!).
  • Expected to work well with sub-1us 4Kb i/o latency once such devices appear (Optane).
  • Works very well with the C++ standard library, and is intended to be proposed for standardisation into C++ in 2020 or thereabouts.

I will of course caveat that this is an alpha quality library, and you should not use it in production code. However, quite a few people are already doing so. Indeed, I've delivered some code for clients of mine written using AFIO, and they were delighted with the performance, especially latencies at 99%.

Edit: it was suggested I add some code demonstrating use of AFIO to solve the OP's problem. Note that AFIO is a very low level library, so you have to type a lot more code to achieve the same as iostreams. On the other hand, you get no memory allocation, no exception throwing, and no unpredictable latency spikes:

    // Try to read first line from file at path, returning no string if file does not exist,
    // throwing exception for any other error
    optional<std::string> read_first_line(filesystem::path path)
    {
      using namespace AFIO_V2_NAMESPACE;
      // The result<T> is from WG21 P0762, it looks quite like an `expected<T, std::error_code>` object
      // See Outcome v2 at https://ned14.github.io/outcome/ and https://lists.boost.org/boost-announce/2017/06/0510.php

      // Open for reading the file at path using a null handle as the base
      result<file_handle> _fh = file({}, path);
      // If fh represents failure ...
      if(!_fh)
      {
        // Fetch the error code
        std::error_code ec = _fh.error();
        // Did we fail due to file not found?
        // It is *very* important to note that ec contains the *original* error code which could
        // be POSIX, or Win32 or NT kernel error code domains. However we can always compare,
        // via 100% C++ 11 STL, any error code to a generic error *condition* for equivalence
        // So this comparison will work as expected irrespective of original error code.
        if(ec == std::errc::no_such_file_or_directory)
        {
          // Return empty optional
          return {};
        }
        std::cerr << "Opening file " << path << " failed with " << ec.message() << std::endl;
      }
      // If errored, result<T>.value() throws an error code failure as if `throw std::system_error(fh.error());`
      // Otherwise unpack the value containing the valid file_handle
      file_handle fh(std::move(_fh.value()));
      // Configure the scatter buffers for the read, ideally aligned to a page boundary for DMA
      alignas(4096) char buffer[4096];
      // There is actually a faster to type shortcut for this, but I thought best to spell it out
      file_handle::buffer_type reqs[] = {{buffer, sizeof(buffer)}};
      // Do a blocking read from offset 0 possibly filling the scatter buffers passed in
      file_handle::io_result<file_handle::buffers_type> _buffers_read = read(fh, {reqs, 0});
      if(!_buffers_read)
      {
        // Fetch the error code of the failed read (not of the earlier open)
        std::error_code ec = _buffers_read.error();
        std::cerr << "Reading the file " << path << " failed with " << ec.message() << std::endl;
      }
      // Same as before, either throw any error or unpack the value returned
      file_handle::buffers_type buffers_read(_buffers_read.value());
      // Note that buffers returned by AFIO read() may be completely different to buffers submitted
      // This lets us skip unnecessary memory copying

      // Make a string view of the first buffer returned
      string_view v(buffers_read[0].data, buffers_read[0].len);
      // Sub view that view with the first line
      string_view line(v.substr(0, v.find_first_of('\n')));
      // Return a string copying the first line from the file, or all 4096 bytes read if no newline found.
      return std::string(line);
    }
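
The comparison of an error code against a generic error condition used in the code above does not depend on AFIO. It can be tried in isolation with nothing but <system_error>. In the sketch below the helper name is_file_not_found is hypothetical and only wraps the comparison.

```cpp
#include <cerrno>
#include <system_error>

// Hypothetical helper, not part of AFIO: true if `ec` means "file not found",
// whichever category (domain) the code originally came from.
bool is_file_not_found(const std::error_code& ec)
{
    // std::errc is an error *condition* enum. The comparison converts it to a
    // std::error_condition in generic_category and then asks the categories
    // involved whether the code and the condition are equivalent.
    return ec == std::errc::no_such_file_or_directory;
}
```

For example, std::error_code(ENOENT, std::generic_category()) satisfies the check while an EACCES code does not. A code in std::system_category() is expected to satisfy it as well when the implementation maps the platform's "file not found" value to this condition.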

Answers 2

The best thing to do could be to wrap Boost WinAPI and/or POSIX APIs.

The "naive" C++ standard library thing (with bells and whistles) doesn't get you too far:

Live On Coliru

    #include <iostream>
    #include <fstream>
    #include <vector>

    template <typename Out> Out read_file(std::string const& path, Out out) {
        std::ifstream s;
        s.exceptions(std::ios::badbit | std::ios::eofbit | std::ios::failbit);
        s.open(path, std::ios::binary);

        return out = std::copy(std::istreambuf_iterator<char>{s}, {}, out);
    }

    void test(std::string const& spec) try {
        std::vector<char> data;
        read_file(spec, back_inserter(data));

        std::cout << spec << ": " << data.size() << " bytes read\n";
    } catch(std::ios_base::failure const& f) {
        std::cout << spec << ": " << f.what() << " code " << f.code() << " (" << f.code().message() << ")\n";
    } catch(std::exception const& e) {
        std::cout << spec << ": " << e.what() << "\n";
    };

    int main() {
        test("main.cpp");
        test("nonexistent.cpp");
    }

Prints...:

    main.cpp: 823 bytes read
    nonexistent.cpp: basic_ios::clear: iostream error code iostream:1 (iostream error)

  1. Of course you can add more diagnostics perusing <filesystem> but that's susceptible to races, as mentioned (depending on your application, these can even open up security vulnerabilities, so just say "No").

  2. Using boost::filesystem::ifstream doesn't change the exceptions raised.

  3. Worse still, using Boost Iostreams fails to raise any errors:

    template <typename Out> Out read_file(std::string const& path, Out out) {
        namespace io = boost::iostreams;
        io::stream<io::file_source> s;
        s.exceptions(std::ios::badbit | std::ios::eofbit | std::ios::failbit);
        s.open(path, std::ios::binary);

        return out = std::copy(std::istreambuf_iterator<char>{s}, {}, out);
    }

    Happily prints:

    main.cpp: 956 bytes read
    nonexistent.cpp: 0 bytes read

    Live On Coliru

Answers 3

    #include <iostream>
    #include <fstream>
    #include <string>
    #include <system_error>

    using namespace std;

    int main() {
        ifstream f("testfile.txt");
        if (!f.good()) {
            error_code e(errno, system_category());
            cerr << e.message();
            //...
        }
        // ...
    }

ISO C++ Standard:

The contents of the header "cerrno" are the same as the POSIX header "errno.h" , except that errno shall be defined as a macro. [ Note: The intent is to remain in close alignment with the POSIX standard. — end note ] A separate errno value shall be provided for each thread.
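
Building on the errno approach above, here is a minimal sketch that reads the first line with std::getline and turns failures into std::system_error. The function name first_line_or_throw is hypothetical. The sketch assumes the standard library opens the file through an OS call that sets errno, which common implementations do but the C++ standard does not promise, so it falls back to a generic stream error when errno was left at zero.

```cpp
#include <cerrno>
#include <fstream>
#include <string>
#include <system_error>

// Hypothetical helper: returns the first line of the file at `path`, or
// throws std::system_error describing why it could not.
// Assumption: the iostreams implementation sets errno when the underlying
// open or read fails. This is common behaviour, not a standard guarantee.
std::string first_line_or_throw(const std::string& path)
{
    errno = 0;
    std::ifstream in(path);
    const int open_errno = errno;  // capture before anything overwrites it
    if (!in) {
        if (open_errno != 0)
            throw std::system_error(open_errno, std::generic_category(),
                                    "cannot open " + path);
        // errno was not touched: report a generic stream error instead
        throw std::system_error(std::make_error_code(std::io_errc::stream),
                                "cannot open " + path);
    }

    std::string line;
    errno = 0;
    std::getline(in, line);
    if (in.bad()) {
        const int read_errno = errno;
        // Fall back to EIO if the implementation left errno at zero
        throw std::system_error(read_errno != 0 ? read_errno : EIO,
                                std::generic_category(),
                                "cannot read " + path);
    }
    return line;  // empty if the file itself is empty
}
```

Callers can catch std::system_error and compare e.code() against std::errc values such as no_such_file_or_directory or permission_denied. Whether those comparisons succeed depends on the assumption stated above.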

Answers 4

Check this code:

uSTL is a partial implementation of the C++ standard library that focuses on decreasing the memory footprint of user executables.

https://github.com/msharov/ustl/blob/master/fstream.cc
