reflex::Input Class Reference

updated Mon Apr 10 2017 by Robert van Engelen
 

Input character sequence class for unified access to sources of input text. More...

#include <input.h>


Classes

struct  file_encoding
 Common constants.
 

Public Member Functions

 Input (const Input &input)
 Copy constructor (with intended "move semantics" as internal state is shared, should not rely on using the rhs after copying).

 Input ()
 Construct empty input character sequence.

 Input (const char *cstring)
 Construct input character sequence from a NUL-terminated string.

 Input (const std::string &string)
 Construct input character sequence from a std::string.

 Input (const std::string *string)
 Construct input character sequence from a pointer to a std::string.

 Input (const wchar_t *wstring)
 Construct input character sequence from a NUL-terminated wide character string.

 Input (const std::wstring &wstring)
 Construct input character sequence from a std::wstring (may contain UTF-16 surrogate pairs).

 Input (const std::wstring *wstring)
 Construct input character sequence from a pointer to a std::wstring (may contain UTF-16 surrogate pairs).

 Input (FILE *file)
 Construct input character sequence from an open FILE* file descriptor, supports UTF-8 conversion from UTF-16 and UTF-32, use stdin if file == NULL.

 Input (FILE *file, unsigned short enc)
 Construct input character sequence from an open FILE* file descriptor, supports UTF-8 conversion from UTF-16 and UTF-32, use stdin if file == NULL.

 Input (std::istream &istream)
 Construct input character sequence from a std::istream.

 Input (std::istream *istream)
 Construct input character sequence from a pointer to a std::istream, use stdin if istream == NULL.

 operator const char * ()
 Cast this Input object to a string, returns NULL when this Input is not a string.

 operator const wchar_t * ()
 Cast this Input object to a wide character string, returns NULL when this Input is not a wide string.

 operator FILE * ()
 Cast this Input object to a file descriptor FILE*, returns NULL when this Input is not a FILE*.

 operator std::istream * ()
 Cast this Input object to a std::istream*, returns NULL when this Input is not a std::istream.

 operator bool ()

const char * cstring ()
 Get the remaining string of this Input object, returns NULL when this Input is not a string.

const wchar_t * wstring ()
 Get the remaining wide character string of this Input object, returns NULL when this Input is not a wide string.

FILE * file ()
 Get the FILE* of this Input object, returns NULL when this Input is not a FILE*.

std::istream * istream ()
 Get the std::istream of this Input object, returns NULL when this Input is not a std::istream.

size_t size ()
 Get the size of the input character sequence in number of ASCII/UTF-8 bytes (zero if size is not determinable from a FILE* or std::istream source).

bool assigned () const
 Check if this Input object was assigned a character sequence.

void clear ()
 Clear this Input by unassigning it.

bool good ()
 Check if input is available.

bool eof ()
 Check if input reached EOF.

size_t get (char *s, size_t n)
 Copy character sequence data into buffer.

void file_encoding (unsigned short enc)
 Set encoding for FILE* input.

unsigned short file_encoding () const
 Get encoding of the current FILE* input.
 

Protected Member Functions

void init ()
 Initialize the state after (re)setting the input source, auto-detects UTF BOM in FILE* input if the file size is known.

void file_init ()
 Implements init() on a FILE*.

size_t file_get (char *s, size_t n)
 Implements get() on a FILE*.

void file_size ()
 Implements size() on a FILE*.

bool file_good ()
 Implements good() operation on a FILE*.

bool file_eof ()
 Implements eof() on a FILE*.
 

Protected Attributes

const char * cstring_
 NUL-terminated char string input (when non-null)

const wchar_t * wstring_
 NUL-terminated wide string input (when non-null)

FILE * file_
 FILE* input (when non-null)

std::istream * istream_
 stream input (when non-null)

size_t size_
 size of the input in bytes, when known

char utf8_ [8]
 UTF-8 conversion buffer.

unsigned short uidx_
 index in utf8_[] or >= 8 when unused

unsigned short utfx_
 file_encoding
 

Detailed Description

Input character sequence class for unified access to sources of input text.

Description

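The Input class unifies access to a source of input text that constitutes a sequence of characters. As the constructors listed for this class show, a source is a NUL-terminated string, a std::string, a NUL-terminated wide character string, a std::wstring, an open FILE*, or a std::istream.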

Example

The following example shows how to use the Input class to read a character sequence in blocks from a std::ifstream to copy to stdout:

std::ifstream ifs;
ifs.open("input.h", std::ifstream::in);
reflex::Input input(ifs);
char buf[1024];
size_t len;
while ((len = input.get(buf, sizeof(buf))) > 0)
  fwrite(buf, 1, len, stdout);
if (!input.eof())
  std::cerr << "An IO error occurred" << std::endl;
ifs.close();

Example

The following example shows how to use the Input class to store the entire content of a file in a temporary buffer:

reflex::Input input(fopen("input.h", "r"));
if (input.file() == NULL)
  abort();
size_t len = input.size(); // file size (minus any leading UTF BOM)
char *buf = new char[len];
input.get(buf, len);
if (!input.eof())
  std::cerr << "An IO error occurred" << std::endl;
fwrite(buf, 1, len, stdout);
delete[] buf;
fclose(input.file());

In the above, files with UTF-16 and UTF-32 content are converted to UTF-8 by get(buf, len). Also, size() returns the total number of UTF-8 bytes that get(buf, len) copies into the buffer. The size is computed from the file content encoding (UTF-8, UTF-16, or UTF-32), which is determined by a leading UTF BOM in the file. This means that UTF-16/32 files are read twice: first internally by size() and then again by get(buf, len).
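To make the role of the leading UTF BOM concrete, the following is a minimal sketch of BOM detection on the first bytes of a buffer. It is not the library's implementation: the names Bom, BomInfo and detect_bom are hypothetical and exist only for this illustration, and the byte patterns are the standard Unicode byte order marks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical encoding tags for this sketch (not the library's constants).
enum class Bom { none, utf8, utf16be, utf16le, utf32be, utf32le };

struct BomInfo {
  Bom kind;           // detected encoding, or Bom::none
  std::size_t length; // number of BOM bytes to skip before the text starts
};

// Inspect up to the first four bytes of a buffer for a Unicode byte order mark.
inline BomInfo detect_bom(const unsigned char *p, std::size_t n)
{
  assert(p != nullptr || n == 0);
  // UTF-32 is tested before UTF-16 because FF FE 00 00 also starts with FF FE.
  if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF)
    return { Bom::utf32be, 4 };
  if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00)
    return { Bom::utf32le, 4 };
  if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
    return { Bom::utf8, 3 };
  if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF)
    return { Bom::utf16be, 2 };
  if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE)
    return { Bom::utf16le, 2 };
  return { Bom::none, 0 };
}
```

A reader built this way would skip length bytes and then decode the remainder according to kind, which is consistent with the remark that the reported size excludes any leading BOM.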

Example

The following example shows how to use the Input class to read a character sequence in blocks from a file:

reflex::Input input(fopen("input.h", "r"));
char buf[1024];
size_t len;
while ((len = input.get(buf, sizeof(buf))) > 0)
  fwrite(buf, 1, len, stdout);
fclose(input);

Example

The following example shows how to use the Input class to echo characters one by one from stdin, e.g. reading input from a tty:

reflex::Input input(stdin);
char c;
while (input.get(&c, 1))
  fputc(c, stdout);

Example

The following example shows how to use the Input class to read a character sequence in blocks from a wide character string, converting it to UTF-8 to copy to stdout:

reflex::Input input(L"Copyright ©"); // © is Unicode U+00A9 and UTF-8 C2 A9
char buf[8];
size_t len;
while ((len = input.get(buf, sizeof(buf))) > 0)
  fwrite(buf, 1, len, stdout);

Example

The following example shows how to use the Input class to convert a wide character string to UTF-8:

reflex::Input input(L"Copyright ©"); // © is Unicode U+00A9 and UTF-8 C2 A9
size_t len = input.size(); // size of UTF-8 string
char *buf = new char[len + 1];
input.get(buf, len);
buf[len] = '\0'; // make \0-terminated
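For readers who want to see what converting a wide character to UTF-8 involves, here is a minimal sketch of encoding a single Unicode code point. The function encode_utf8 is a hypothetical helper written for this illustration, not part of the Input class, and the choice to replace invalid code points with U+FFFD is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>

// Encode one Unicode code point as UTF-8 into out, which needs room for 4 bytes.
// Returns the number of bytes written (1 to 4).
// Assumption of this sketch: invalid code points are replaced by U+FFFD.
inline std::size_t encode_utf8(char32_t c, char *out)
{
  assert(out != nullptr);
  if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    c = 0xFFFD;
  if (c < 0x80) {
    out[0] = static_cast<char>(c);
    return 1;
  }
  if (c < 0x800) {
    out[0] = static_cast<char>(0xC0 | (c >> 6));
    out[1] = static_cast<char>(0x80 | (c & 0x3F));
    return 2;
  }
  if (c < 0x10000) {
    out[0] = static_cast<char>(0xE0 | (c >> 12));
    out[1] = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
    out[2] = static_cast<char>(0x80 | (c & 0x3F));
    return 3;
  }
  out[0] = static_cast<char>(0xF0 | (c >> 18));
  out[1] = static_cast<char>(0x80 | ((c >> 12) & 0x3F));
  out[2] = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
  out[3] = static_cast<char>(0x80 | (c & 0x3F));
  return 4;
}
```

With this helper, U+00A9 (©) yields the two bytes C2 A9 mentioned in the comment of the example above, so the UTF-8 form of a string can be longer in bytes than its count of wide characters.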

Example

The following example shows how to switch source inputs while reading input byte by byte (use a buffer as shown in other examples to improve efficiency):

reflex::Input input = "Hello";
std::string message;
char c;
while (input.get(&c, 1))
  message.append(1, c); // std::string::append takes a count and a char
input = L" world! To ∞ and beyond."; // switch input to a wide string
while (input.get(&c, 1))
  message.append(1, c);

Constructor & Destructor Documentation

reflex::Input::Input ( const Input &  input)
inline

Copy constructor (with intended "move semantics" as internal state is shared, should not rely on using the rhs after copying).

Parameters
input  an Input object to share state with (undefined behavior results from using both objects)

reflex::Input::Input ( )
inline

Construct empty input character sequence.

reflex::Input::Input ( const char *  cstring)
inline

Construct input character sequence from a NUL-terminated string.

Parameters
cstring  NUL-terminated char* string

reflex::Input::Input ( const std::string &  string)
inline

Construct input character sequence from a std::string.

Parameters
string  input string

reflex::Input::Input ( const std::string *  string)
inline

Construct input character sequence from a pointer to a std::string.

Parameters
string  input string

reflex::Input::Input ( const wchar_t *  wstring)
inline

Construct input character sequence from a NUL-terminated wide character string.

Parameters
wstring  NUL-terminated wchar_t* input string

reflex::Input::Input ( const std::wstring &  wstring)
inline

Construct input character sequence from a std::wstring (may contain UTF-16 surrogate pairs).

Parameters
wstring  input wide string

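The note that a std::wstring may contain UTF-16 surrogate pairs applies on platforms where wchar_t is 16 bits wide. As a minimal sketch of what handling such pairs means, the hypothetical helper decode_utf16 below combines a high and a low surrogate into one code point. It is an illustration under stated assumptions, not the library's code; in particular, passing an unpaired surrogate through unchanged is a choice made only for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Decode one code point from the UTF-16 code units s[0..n), starting at *i,
// and advance *i past the units consumed.
// Assumption of this sketch: an unpaired surrogate is returned unchanged.
inline char32_t decode_utf16(const char16_t *s, std::size_t n, std::size_t *i)
{
  assert(s != nullptr && i != nullptr && *i < n);
  char32_t c = s[(*i)++];
  if (c >= 0xD800 && c <= 0xDBFF && *i < n) {
    char32_t lo = s[*i];
    if (lo >= 0xDC00 && lo <= 0xDFFF) {
      ++*i;
      // High surrogate carries the upper 10 bits, low surrogate the lower 10.
      return 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
    }
  }
  return c;
}
```

For example, the pair D83D DE00 decodes to U+1F600, which a UTF-8 encoder would then emit as four bytes.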
reflex::Input::Input ( const std::wstring *  wstring)
inline

Construct input character sequence from a pointer to a std::wstring (may contain UTF-16 surrogate pairs).

Parameters
wstring  input wide string

reflex::Input::Input ( FILE *  file)
inline

Construct input character sequence from an open FILE* file descriptor, supports UTF-8 conversion from UTF-16 and UTF-32, use stdin if file == NULL.

Parameters
file  input file

reflex::Input::Input ( FILE *  file, unsigned short  enc)
inline

Construct input character sequence from an open FILE* file descriptor, supports UTF-8 conversion from UTF-16 and UTF-32, use stdin if file == NULL.

Parameters
file  input file
enc  file_encoding (when UTF BOM is not present)

reflex::Input::Input ( std::istream &  istream)
inline

Construct input character sequence from a std::istream.

Parameters
istream  input stream

reflex::Input::Input ( std::istream *  istream)
inline

Construct input character sequence from a pointer to a std::istream, use stdin if istream == NULL.

Parameters
istream  input stream

Member Function Documentation

bool reflex::Input::assigned ( ) const
inline

Check if this Input object was assigned a character sequence.

Returns
true if this Input object was assigned (not default constructed or cleared).
void reflex::Input::clear ( )
inline

Clear this Input by unassigning it.

const char* reflex::Input::cstring ( )
inline

Get the remaining string of this Input object, returns NULL when this Input is not a string.

Returns
remaining unbuffered part of the NUL-terminated string or NULL.
bool reflex::Input::eof ( )
inline

Check if input reached EOF.

Returns
true if input is at EOF and no characters are available.
FILE* reflex::Input::file ( )
inline

Get the FILE* of this Input object, returns NULL when this Input is not a FILE*.

Returns
pointer to current file descriptor or NULL.
void reflex::Input::file_encoding ( unsigned short  enc)

Set encoding for FILE* input.

unsigned short reflex::Input::file_encoding ( ) const
inline

Get encoding of the current FILE* input.

bool reflex::Input::file_eof ( )
inline protected

Implements eof() on a FILE*.

size_t reflex::Input::file_get ( char *  s, size_t  n)
protected

Implements get() on a FILE*.

Parameters
s  points to the string buffer to fill with input
n  size of buffer pointed to by s

bool reflex::Input::file_good ( )
inline protected

Implements good() operation on a FILE*.

void reflex::Input::file_init ( )
protected

Implements init() on a FILE*.

void reflex::Input::file_size ( )
protected

Implements size() on a FILE*.

size_t reflex::Input::get ( char *  s, size_t  n)
inline

Copy character sequence data into buffer.

Returns
the nonzero number of 8-bit characters (less than or equal to n) added to buffer s from the current input, or zero at EOF.
Parameters
s  points to the string buffer to fill with input
n  size of buffer pointed to by s

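The return contract of get() (a nonzero count of at most n, or zero at EOF) is what makes the block-reading loops in the examples terminate. The sketch below mimics that contract with a hypothetical in-memory class, StringSource, and a hypothetical helper, read_all. Neither is part of the library; they only illustrate the loop shape under the assumption that the source follows the same contract.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for an input source; this is NOT reflex::Input.
class StringSource {
 public:
  explicit StringSource(const char *s) : p_(s), n_(std::strlen(s)) { }
  // Copy up to n bytes into s; returns the count copied, or zero at end of input.
  std::size_t get(char *s, std::size_t n)
  {
    assert(s != nullptr || n == 0);
    std::size_t k = n < n_ ? n : n_;
    if (k > 0)
      std::memcpy(s, p_, k);
    p_ += k;
    n_ -= k;
    return k;
  }
 private:
  const char *p_;  // next unread byte
  std::size_t n_;  // number of unread bytes
};

// Drain any source that has a get(char*, size_t) member into a std::string.
template<typename Source>
std::string read_all(Source& in)
{
  std::string out;
  char buf[4];  // deliberately tiny, to force several calls to get()
  std::size_t len;
  while ((len = in.get(buf, sizeof(buf))) > 0)
    out.append(buf, len);
  return out;
}
```

Because the loop relies only on the returned count, the buffer never needs to be NUL-terminated, and a short final block is handled the same way as a full one.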
bool reflex::Input::good ( )
inline

Check if input is available.

Returns
true if a non-empty sequence of characters is available to get.
void reflex::Input::init ( )
inline protected

Initialize the state after (re)setting the input source, auto-detects UTF BOM in FILE* input if the file size is known.

std::istream* reflex::Input::istream ( )
inline

Get the std::istream of this Input object, returns NULL when this Input is not a std::istream.

Returns
pointer to current std::istream or NULL.
reflex::Input::operator bool ( )
inline
Returns
true if a non-empty sequence of characters is available to get.
reflex::Input::operator const char * ( )
inline

Cast this Input object to a string, returns NULL when this Input is not a string.

Returns
remaining unbuffered part of the NUL-terminated string or NULL.
reflex::Input::operator const wchar_t * ( )
inline

Cast this Input object to a wide character string, returns NULL when this Input is not a wide string.

Returns
remaining unbuffered part of the NUL-terminated wide character string or NULL.
reflex::Input::operator FILE * ( )
inline

Cast this Input object to a file descriptor FILE*, returns NULL when this Input is not a FILE*.

Returns
pointer to current file descriptor or NULL.
reflex::Input::operator std::istream * ( )
inline

Cast this Input object to a std::istream*, returns NULL when this Input is not a std::istream.

Returns
pointer to current std::istream or NULL.
size_t reflex::Input::size ( )
inline

Get the size of the input character sequence in number of ASCII/UTF-8 bytes (zero if size is not determinable from a FILE* or std::istream source).

Returns
the nonzero number of ASCII/UTF-8 bytes available to read, or zero when source is empty or if size is not determinable.
Warning
This function SHOULD NOT be used after get(): once the "cursor" has moved, the result changes.
const wchar_t* reflex::Input::wstring ( )
inline

Get the remaining wide character string of this Input object, returns NULL when this Input is not a wide string.

Returns
remaining unbuffered part of the NUL-terminated wide character string or NULL.

Member Data Documentation

const char* reflex::Input::cstring_
protected

NUL-terminated char string input (when non-null)

FILE* reflex::Input::file_
protected

FILE* input (when non-null)

std::istream* reflex::Input::istream_
protected

stream input (when non-null)

size_t reflex::Input::size_
protected

size of the input in bytes, when known

unsigned short reflex::Input::uidx_
protected

index in utf8_[] or >= 8 when unused

char reflex::Input::utf8_[8]
protected

UTF-8 conversion buffer.

unsigned short reflex::Input::utfx_
protected

file_encoding

const wchar_t* reflex::Input::wstring_
protected

NUL-terminated wide string input (when non-null)


The documentation for this class was generated from the following file: