FOSP
By Rahul Razz
    void
    exit (int status)
    {
      __run_exit_handlers (status, &__exit_funcs, true, true);
    }
- After main() returns, the program calls exit(), which hands control to __run_exit_handlers():
exit() --> __run_exit_handlers()
1. Inside exit_handlers
    void
    attribute_hidden
    __run_exit_handlers (int status, struct exit_function_list **listp,
                         bool run_list_atexit, bool run_dtors)
    {
      ...
      while (true)
        {
          ...
          while (cur->idx > 0)
            {
              struct exit_function *const f = &cur->fns[--cur->idx];
              const uint64_t new_exitfn_called = __new_exitfn_called;
              switch (f->flavor)
                {
                  .....
                }
              ....
            }
          ....
        }
      __libc_lock_unlock (__exit_funcs_lock);
      if (run_list_atexit)
        call_function_static_weak (_IO_cleanup); // This is the function we trace for the file stream flush operations.
      _exit (status);
    }
Full exit.c code : exit.c
_IO_cleanup() purpose:
- This function is part of glibc's internal I/O system (libio).
- It is called at program termination (via exit() or similar paths) to:
  - Flush all open standard I/O streams (stdout, stderr, file streams, etc.)
  - Make sure any buffered data is written to files.
  - Switch streams to unbuffered mode afterward.
    int
    _IO_cleanup (void)
    {
      int result = _IO_flush_all ();
      _IO_unbuffer_all ();
      return result;
    }
int result = _IO_flush_all ();
- This calls _IO_flush_all(), which:
  - Iterates over all open FILE* objects.
  - Flushes (writes out) any data still in their buffers.
  - Returns a result code (0 on success, EOF if any flush failed).
_IO_unbuffer_all ();
- This function iterates over all open FILE* streams and sets their buffering mode to unbuffered (like calling setbuf(stream, NULL) for each).
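To see the effect of this cleanup from user code, here is a minimal sketch that compares exit() with _exit(), which skips the stdio cleanup. The helper name bytes_seen_by_parent is made up for this illustration, and the sketch assumes a POSIX system where a stream opened on a pipe is fully buffered.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that writes 5 bytes to a buffered
   stream on a pipe and then terminates with exit() or _exit().
   Returns how many bytes the parent received, or -1 on setup failure. */
long bytes_seen_by_parent(int use_exit)
{
    int pfd[2];
    if (pipe(pfd) != 0)
        return -1;

    fflush(NULL);                 /* do not let the child inherit pending output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        close(pfd[0]);
        FILE *out = fdopen(pfd[1], "w");   /* pipe, not a tty: fully buffered */
        if (out == NULL)
            _exit(1);
        fputs("hello", out);               /* stays in the stdio buffer */
        if (use_exit)
            exit(0);                       /* exit handlers run, buffer is flushed */
        _exit(0);                          /* no stdio cleanup, buffer is dropped */
    }

    close(pfd[1]);
    long total = 0;
    char buf[64];
    ssize_t n;
    while ((n = read(pfd[0], buf, sizeof buf)) > 0)
        total += n;
    close(pfd[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

With use_exit set, the five buffered bytes should reach the parent because the exit path flushes them; with _exit() the parent should see nothing.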
2. _IO_FILE_plus:
Before moving on to _IO_flush_all, we need to discuss _IO_FILE_plus.
What is _IO_FILE_plus?
In user-level C code, you typically see streams as:
FILE *fp = fopen("data.txt", "w");
But internally in glibc, a FILE is implemented as a struct defined in source/libio/bits/types/struct_FILE.h
    struct _IO_FILE
    {
      int _flags;                       // File status flags (read/write/eof/error)
      char *_IO_read_ptr;               // Current read pointer in the buffer
      char *_IO_read_end;               // End of readable buffer
      char *_IO_read_base;              // Start of readable buffer
      char *_IO_write_base;             // Start of write buffer
      char *_IO_write_ptr;              // Current write pointer
      char *_IO_write_end;              // End of write buffer
      char *_IO_buf_base;               // Base of allocated buffer (for read/write)
      char *_IO_buf_end;                // End of allocated buffer
      char *_IO_save_base;              // Backup of buffer base (used in ungetc)
      char *_IO_backup_base;            // Backup buffer base
      char *_IO_save_end;               // Backup buffer end
      struct _IO_marker *_markers;      // Linked list of markers (used for positioning)
      struct _IO_FILE *_chain;          // Next FILE in linked list of open streams
      int _fileno;                      // File descriptor (OS handle)
      int _flags2 : 24;                 // Extra flags for internal use
      char _short_backupbuf[1];         // Tiny backup buffer for special cases
      __off_t _old_offset;              // Previous file offset (for seek operations)
      unsigned short _cur_column;       // Current column number (for text streams)
      signed char _vtable_offset;       // Offset of vtable pointer in object (0 for normal FILE)
      char _shortbuf[1];                // Tiny buffer for putc/ungetc
      _IO_lock_t *_lock;                // Lock for thread-safe access
      __off64_t _offset;                // Current file position (64-bit offset)
      // Wide character support
      struct _IO_codecvt *_codecvt;     // Codecvt object for character conversion (wide char support)
      struct _IO_wide_data *_wide_data; // Buffer and state for wide-character I/O
      struct _IO_FILE *_freeres_list;   // List of freed FILE objects (for cleanup)
      void *_freeres_buf;               // Buffer used for freeing FILEs
      struct _IO_FILE **_prevchain;     // Previous FILE in the global chain
      int _mode;                        // Stream orientation: 0 = undecided, >0 = wide, <0 = byte
      char _unused2[20];                // Padding / reserved for future use
    };

    // Finally
    struct _IO_FILE_plus
    {
      FILE file;
      const struct _IO_jump_t *vtable;
    };
Don't be intimidated by all these fields. For the exploit part we only need to focus on these:
- _chain
- _lock
- _wide_data
- _mode
- _IO_jump_t *vtable [this one is the most important]
- Apart from the entries above, we also need to understand the char * pointers for read, write, buf, save and backup.
So when we call fopen to open a file, glibc initializes this file struct roughly as follows:
    fopen()
    ├── _IO_new_fopen()
    ├── _IO_new_file_fopen()
    ├── _IO_file_open()   → does the low-level open() syscall
    ├── _IO_file_init()   → initializes vtable & buffering
    └── returns _IO_FILE_plus object   // this is our struct file
_chain:
Let's say you open two files named file1.txt and file2.txt.
Each open gives us an _IO_FILE_plus struct whose _fileno field holds the file descriptor returned by the kernel. If file1.txt is opened first it gets _fileno=3, and file2.txt opened next gets _fileno=4 (assuming only the three standard descriptors were open before).
Now, we will observe that the _chain fields of the two files are different.
Before moving on to _chain, I would like to introduce a very famous pointer. I call it famous because it is a global pointer inside glibc's libio layer, named _IO_list_all.
_IO_list_all:
- It points to the head of a linked list of all currently active (open) FILE* streams.
- It is essential for process cleanup because it gives access to the list of all open file pointers.
- Initially it contains _IO_list_all → _IO_2_1_stderr_ (fd=2). When we open any other file, it is added at the head of _IO_list_all, like _IO_list_all → OurFile_pointer (fd=3).
As the last point says, each newly opened file is linked at the head of _IO_list_all. But what happens to the file pointer that was linked there before, and how does _IO_list_all reach all the other files?
This is where _chain comes in. The _chain field simply holds the pointer that was at the head of _IO_list_all before the current file was opened. In other words, OurFile_pointer(fd=3)->_chain points to _IO_2_1_stderr_(fd=2), and the list continues:
_IO_2_1_stderr_(fd=2)->_chain = _IO_2_1_stdout_(fd=1)
_IO_2_1_stdout_(fd=1)->_chain = _IO_2_1_stdin_(fd=0)
_IO_2_1_stdin_(fd=0)->_chain = NULL
Observation:
- FD(n)->_chain = FD(n-1)
- Each file pointer's _chain holds the previously opened file pointer.
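The head-insertion behaviour can be sketched with a toy model. Everything here (toy_file, toy_list_all, toy_link_in, toy_walk) is a hypothetical stand-in written for this explanation and is not glibc code; it only mirrors the _chain and _IO_list_all relationship described above.

```c
#include <stddef.h>

/* Toy stand-in for FILE: only the two fields discussed here. */
struct toy_file {
    int fileno;
    struct toy_file *chain;
};

/* Toy stand-in for _IO_list_all. */
static struct toy_file *toy_list_all = NULL;

/* Link a newly "opened" stream at the head of the list. */
void toy_link_in(struct toy_file *fp)
{
    fp->chain = toy_list_all;   /* remember the previous head */
    toy_list_all = fp;          /* the new stream becomes the head */
}

/* Walk the list from the head and record descriptors in visit order.
   Returns the number of streams visited (at most max). */
size_t toy_walk(int *out, size_t max)
{
    size_t n = 0;
    for (struct toy_file *fp = toy_list_all; fp != NULL && n < max; fp = fp->chain)
        out[n++] = fp->fileno;
    return n;
}
```

Linking descriptors 0, 1, 2, 3, 4 in that order and then walking the list visits them as 4, 3, 2, 1, 0, which is the same newest-first order a walk over the real list would follow.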
_lock:
To understand this field, first ask why we need it. Files mean reads and writes, and modern machines are fast largely because they run multiple CPUs and multiple threads. So we have to think about what happens when several threads want to read or write the same file at the same time. To avoid race conditions, glibc uses mutual exclusion (a mutex): the stream is locked so that only one thread uses it at a time while the others wait.
- This is implemented by pointing _lock at a mutex object.
- _lock is either set to NULL or points to writable memory.
    pwndbg> p *(pthread_mutex_t *)stdout->_lock
    $9 = {
      __data = {
        __lock = 0,
        ...
    }
- This is the mutex that stdout->_lock points to.
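From user code the stream lock is reachable through the POSIX calls flockfile() and funlockfile(). The sketch below is only an illustration of that public interface; write_record_locked and put_string_unlocked are names invented here, and how the lock maps onto _lock internally is glibc's business.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Emit one string; the caller must already hold the stream lock. */
static int put_string_unlocked(const char *s, FILE *fp)
{
    for (; *s != '\0'; s++)
        if (putc_unlocked(*s, fp) == EOF)
            return EOF;
    return 0;
}

/* Write "key=value\n" as one unit so that output from another thread
   cannot land between the pieces. Returns 0 on success, EOF on error. */
int write_record_locked(FILE *fp, const char *key, const char *value)
{
    int rc = 0;
    flockfile(fp);                       /* take the per-stream lock */
    if (put_string_unlocked(key, fp) == EOF
        || putc_unlocked('=', fp) == EOF
        || put_string_unlocked(value, fp) == EOF
        || putc_unlocked('\n', fp) == EOF)
        rc = EOF;
    funlockfile(fp);                     /* release it for other threads */
    return rc;
}
```

The ordinary stdio calls take the same lock on every call, which is why a forged FILE usually needs a _lock that points somewhere the lock code can safely write.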
_wide_data:
Whenever you write C, Python, or similar code, you mostly use ASCII characters. But when you are chatting with someone you do not always type ASCII; sometimes you want to show emotion with an emoji. Have you ever wondered how many emojis your phone has, and how a one-byte encoding with at most 256 values could represent all of them?
It cannot. Those emojis do not fit in a single byte (0x00 to 0xff), so we need something wider to represent them.
This is where _wide_data comes in, to manage those larger characters.
- _wide_data is a pointer to a separate structure that stores buffers, pointers, and state for wide-character I/O.
- Regular char I/O (like fwrite) uses _IO_write_base / _IO_write_ptr / _IO_buf_base.
- Wide wchar_t I/O uses _wide_data->_IO_write_base / _IO_write_ptr / _IO_buf_base.
_mode:
Reading the name, someone might mistake it for the file's open mode (read, write, truncate, etc.). But the _mode field does not represent the read/write mode. It represents the character orientation of the file stream (whether it handles normal bytes or wide characters).
Now you can connect this with _wide_data above: regular I/O uses _IO_write_base, while wide mode uses _wide_data->_IO_write_base.
- _mode indicates whether the stream is byte-oriented, wide-oriented, or not yet decided.
- It helps glibc determine whether to use the normal I/O buffers (_IO_write_ptr, _IO_read_ptr) or the wide-character buffers (_wide_data->_IO_write_ptr, _wide_data->_IO_read_ptr).
- The _mode field is a signed int:
  - 0 → orientation not yet determined (stream unused or undecided)
  - <0 → byte-oriented stream (used by printf, fread, etc.)
  - >0 → wide-character-oriented stream (used by fwprintf, fgetwc, etc.)
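The orientation can be observed from user code with the standard fwide() function, whose return value uses the same sign convention (negative for byte, positive for wide, zero for undecided). The helper orientation_after below is a name made up for this sketch.

```c
#include <stdio.h>
#include <wchar.h>

/* Report the orientation of a fresh temporary stream after an optional
   first operation: 'b' = byte write, 'w' = wide write, anything else = none.
   The result follows fwide(): <0 byte, 0 undecided, >0 wide.
   Also returns 0 if the temporary file cannot be created. */
int orientation_after(char first_op)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return 0;

    if (first_op == 'b')
        fputc('A', fp);        /* first byte operation fixes byte orientation */
    else if (first_op == 'w')
        fputwc(L'A', fp);      /* first wide operation fixes wide orientation */

    int orientation = fwide(fp, 0);   /* mode 0 only queries */
    fclose(fp);
    return orientation;
}
```

Once a stream has an orientation it keeps it until it is closed, so the first I/O call on a stream decides which set of buffers is in play.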
_IO_jump_t *vtable:
This is the most important field if you want to understand how _IO_FILE_plus implements polymorphic behavior for all kinds of I/O operations.
The vtable holds a table of function pointers. Internal macros such as _IO_OVERFLOW(fp, EOF) call through it, and which entry is used depends on the operation being performed on the stream. For example, when you call fwrite on fp, the call is routed to
fp->vtable->__xsputn(fp, buf, n);
- vtable is like a menu card of the operations that our FILE is allowed to perform.
- Without vtable, glibc would need if/else checks for every stream type.
    struct _IO_jump_t
    {
      size_t __dummy;                 // placeholder, not used
      size_t __dummy2;                // placeholder, not used
      _IO_finish_t __finish;          // called when finishing stream (cleanup buffers)
      _IO_overflow_t __overflow;      // called when writing to a full buffer
      _IO_underflow_t __underflow;    // called when reading from empty buffer
      _IO_underflow_t __uflow;        // called to read a single character
      _IO_pbackfail_t __pbackfail;    // called when ungetc fails (pushing back char)
      _IO_xsputn_t __xsputn;          // called to write n bytes (fwrite uses this)
      _IO_xsgetn_t __xsgetn;          // called to read n bytes (fread uses this)
      _IO_seekoff_t __seekoff;        // called to seek by offset (fseek)
      _IO_seekpos_t __seekpos;        // called to seek to a specific position
      _IO_setbuf_t __setbuf;          // called to set buffering mode (setvbuf)
      _IO_sync_t __sync;              // called to flush buffers (fflush)
      _IO_doallocate_t __doallocate;  // called to allocate internal buffer if needed
      _IO_read_t __read;              // low-level read (OS read)
      _IO_write_t __write;            // low-level write (OS write)
      _IO_seek_t __seek;              // low-level seek (lseek wrapper)
      _IO_close_t __close;            // low-level close (fclose wrapper)
      _IO_stat_t __stat;              // get file status (fstat)
      _IO_showmanyc_t __showmanyc;    // estimate number of characters available to read
      _IO_imbue_t __imbue;            // set locale/encoding (for wide-char streams)
    };
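To make the "menu card" idea concrete, here is a toy dispatch through a one-slot jump table. All names (toy_stream, toy_jump_t, toy_write and the two tables) are hypothetical and written for this explanation; the real table has the many slots listed above.

```c
#include <stddef.h>
#include <string.h>

struct toy_stream;

/* Toy jump table with a single slot, modelled on __xsputn. */
struct toy_jump_t {
    size_t (*xsputn)(struct toy_stream *fp, const char *data, size_t n);
};

/* Toy stream: a small buffer plus the table that defines its behaviour. */
struct toy_stream {
    char buf[32];
    size_t used;
    const struct toy_jump_t *vtable;
};

/* "File-like" behaviour: store as many bytes as fit. */
static size_t store_xsputn(struct toy_stream *fp, const char *data, size_t n)
{
    size_t room = sizeof fp->buf - fp->used;
    if (n > room)
        n = room;
    memcpy(fp->buf + fp->used, data, n);
    fp->used += n;
    return n;
}

/* "Null sink" behaviour: accept and discard everything. */
static size_t discard_xsputn(struct toy_stream *fp, const char *data, size_t n)
{
    (void) fp;
    (void) data;
    return n;
}

const struct toy_jump_t toy_store_jumps = { store_xsputn };
const struct toy_jump_t toy_discard_jumps = { discard_xsputn };

/* Caller-facing write: no if/else on the stream type, only a table lookup. */
size_t toy_write(struct toy_stream *fp, const char *data, size_t n)
{
    return fp->vtable->xsputn(fp, data, n);
}
```

The same toy_write call behaves differently depending only on which table the stream points at, which is also why control over that pointer is so valuable to an attacker.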
3. Now move Inside _IO_flush_all()
Having learned a lot about the FILE struct, we are now ready to understand the code below.
What does _IO_flush_all do:
- Lock _IO_list_all for thread safety (multiple threads might be writing to different streams; we don't want to flush while someone else is modifying one).
- Walk the linked list starting at _IO_list_all.
- For each stream:
  - Check if there's buffered data (uses _mode to decide whether to look at the wide-character buffer or the normal byte buffer)
  - Flush via _IO_OVERFLOW(fp, EOF)
  - Handle errors (set result = EOF)
- Unlock the global list
- Return success/failure
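The steps above can be sketched as a toy model. This is not glibc code: toy_fp, toy_wide, toy_overflow and toy_flush_all are hypothetical, buffer positions are plain integers instead of pointers, and the two locking steps are left out to keep the sketch short.

```c
#include <stddef.h>

#define TOY_EOF (-1)

/* Wide buffer positions, kept as offsets instead of pointers. */
struct toy_wide {
    int write_base, write_ptr;
};

struct toy_fp {
    int mode;                     /* <= 0 byte or undecided, > 0 wide */
    int write_base, write_ptr;    /* byte buffer positions */
    struct toy_wide *wide;        /* wide buffer positions, may be NULL */
    int fail;                     /* force the toy overflow to fail */
    int flushed;                  /* set once the toy overflow has run */
    struct toy_fp *chain;         /* next stream in the list */
};

/* Toy overflow: pretend to write the pending data out. */
static int toy_overflow(struct toy_fp *fp)
{
    fp->flushed = 1;
    if (fp->fail)
        return TOY_EOF;
    fp->write_ptr = fp->write_base;
    if (fp->wide != NULL)
        fp->wide->write_ptr = fp->wide->write_base;
    return 0;
}

/* Walk the list and flush every stream that has pending output. */
int toy_flush_all(struct toy_fp *head)
{
    int result = 0;
    for (struct toy_fp *fp = head; fp != NULL; fp = fp->chain) {
        int pending_bytes = fp->mode <= 0 && fp->write_ptr > fp->write_base;
        int pending_wide = fp->mode > 0 && fp->wide != NULL
                           && fp->wide->write_ptr > fp->wide->write_base;
        if ((pending_bytes || pending_wide) && toy_overflow(fp) == TOY_EOF)
            result = TOY_EOF;     /* remember the failure, keep walking */
    }
    return result;
}
```

Note that a stream with nothing pending never reaches the overflow call, so a forged stream has to look like it has unwritten data before its vtable is used at all.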
One condition here may look new: _IO_vtable_offset(fp) == 0. It checks whether fp is a normal FILE stream, meaning its vtable pointer is located at the expected position (offset 0) in memory. A non-zero value is only used for compatibility with an older FILE layout, and streams created through the usual calls do not have it set.
So, for this writeup we assume standard file streams as created by fopen, etc.
    int
    _IO_flush_all (void)
    {
      int result = 0;
      FILE *fp;

    #ifdef _IO_MTSAFE_IO
      _IO_cleanup_region_start_noarg (flush_cleanup);
      _IO_lock_lock (list_all_lock);   // lock the global list
    #endif

      // loop over every file pointer opened by the current process
      for (fp = (FILE *) _IO_list_all; fp != NULL; fp = fp->_chain)
        {
          run_fp = fp;
          _IO_flockfile (fp);   // lock the file to avoid a race with another thread

          // byte-oriented or undecided stream with unwritten data in its
          // buffer: it must be flushed before the process ends
          if (((fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base)
               // standard stream layout, wide-oriented (_mode > 0),
               // with pending data in the _wide_data buffer
               || (_IO_vtable_offset (fp) == 0
                   && fp->_mode > 0
                   && (fp->_wide_data->_IO_write_ptr
                       > fp->_wide_data->_IO_write_base)))
              // flush the pending buffer; this is our next target to explore
              && _IO_OVERFLOW (fp, EOF) == EOF)
            result = EOF;

          _IO_funlockfile (fp);   // unlock the file pointer for other threads
          run_fp = NULL;
        }

    #ifdef _IO_MTSAFE_IO
      _IO_lock_unlock (list_all_lock);
      _IO_cleanup_region_end (0);
    #endif

      return result;
    }
Code:_IO_flush_all
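The public way to ask for the same "flush everything" behaviour is fflush(NULL), which the C standard defines as flushing all open output streams. The sketch below watches the file size the kernel reports before and after; flush_all_demo and kernel_size are hypothetical helpers, and it assumes a POSIX system where a tmpfile() stream is fully buffered.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file behind fp as the kernel sees it, or -1 on error. */
static long kernel_size(FILE *fp)
{
    struct stat st;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long) st.st_size;
}

/* Write 3 bytes to a temporary stream and report the size seen by the
   kernel before and after fflush(NULL). Returns 0 on success. */
int flush_all_demo(long *before, long *after)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    fputs("abc", fp);              /* sits between write_base and write_ptr */
    *before = kernel_size(fp);     /* nothing has reached the file yet */
    fflush(NULL);                  /* flush every stream with pending output */
    *after = kernel_size(fp);
    fclose(fp);
    return 0;
}
```

The three bytes only become visible to the kernel after the flush, which is the same transition the exit path forces on every stream in the list.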
4. _IO_OVERFLOW (fp, EOF)
In libioP.h it is defined as a macro, #define _IO_OVERFLOW(FP, CH) JUMP1 (__overflow, FP, CH), and JUMP1 is defined as
#define JUMP1(FUNC, THIS, X1) (_IO_JUMPS_FUNC(THIS)->FUNC) (THIS, X1)
Now we need to understand _IO_JUMPS_FUNC(THIS).
It is again defined as a macro in libioP.h:
    # define _IO_JUMPS_FUNC(THIS) \
      (IO_validate_vtable \
       (*(struct _IO_jump_t **) ((void *) &_IO_JUMPS_FILE_plus (THIS) \
                                 + (THIS)->_vtable_offset)))
- THIS is a pointer to a FILE object (FILE *fp).
- _IO_JUMPS_FILE_plus(THIS) is another glibc-internal macro. It refers to the vtable field of the _IO_FILE_plus that contains the FILE, so taking its address with & gives the location where the vtable pointer is stored.
- + (THIS)->_vtable_offset: we already discussed this; it should be 0.
- _IO_jump_t *vtable = *(struct _IO_jump_t **)vtable_addr; loads the value of fp->vtable into vtable.
- vtable = IO_validate_vtable(vtable); // this is an important step that verifies the correctness of the vtable pointer.
- Now _IO_JUMPS_FUNC(THIS) is replaced by the vtable pointer, and JUMP1(FUNC, THIS, X1) calls (vtable->FUNC)(THIS, X1).
- FUNC selects a slot in the vtable, and its offset depends on the version of libc. I'm testing on libc 2.4, where __overflow is slot 3, i.e. call [vtable+0x18], with one extra argument X1 as required by JUMP1.
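The whole macro chain can be imitated with a toy version. Everything prefixed toy_ or TOY_ is hypothetical and written for this explanation. The validation here simply compares against one known table and redirects anything else to a table that fails, whereas real glibc, as far as I understand it, treats an invalid vtable pointer as a fatal error.

```c
#include <stddef.h>

struct toy_FILE { int last_ch; };
typedef int (*toy_overflow_t)(struct toy_FILE *, int);

/* Same leading layout as the _IO_jump_t excerpt: two placeholders,
   the finish slot, then overflow in slot 3. */
struct toy_jump_t {
    size_t dummy;
    size_t dummy2;
    void (*finish)(struct toy_FILE *, int);
    toy_overflow_t overflow;
};

struct toy_FILE_plus {
    struct toy_FILE file;
    const struct toy_jump_t *vtable;
};

static int toy_file_overflow(struct toy_FILE *fp, int ch) { fp->last_ch = ch; return 0; }
static int toy_rejected(struct toy_FILE *fp, int ch) { (void) fp; (void) ch; return -1; }

/* The only table this toy treats as legitimate, plus a fallback. */
static const struct toy_jump_t toy_known_tables[1] = { { 0, 0, NULL, toy_file_overflow } };
static const struct toy_jump_t toy_reject_table = { 0, 0, NULL, toy_rejected };

/* Toy validation: accept only a pointer to the known table. */
static const struct toy_jump_t *toy_validate_vtable(const struct toy_jump_t *vt)
{
    return vt == &toy_known_tables[0] ? vt : &toy_reject_table;
}

#define TOY_JUMPS_FILE_plus(THIS) (((struct toy_FILE_plus *) (THIS))->vtable)
#define TOY_JUMPS_FUNC(THIS) (toy_validate_vtable(TOY_JUMPS_FILE_plus(THIS)))
#define TOY_JUMP1(FUNC, THIS, X1) (TOY_JUMPS_FUNC(THIS)->FUNC)(THIS, X1)
#define TOY_OVERFLOW(FP, CH) TOY_JUMP1(overflow, FP, CH)

/* Expands to (toy_validate_vtable(...)->overflow)(fp, ch). */
int toy_call_overflow(struct toy_FILE *fp, int ch) { return TOY_OVERFLOW(fp, ch); }

/* Byte offset of slot 3; 0x18 on a typical 64-bit build. */
size_t toy_overflow_offset(void) { return offsetof(struct toy_jump_t, overflow); }
```

A stream whose vtable points at the known table is dispatched normally, while a forged table with identical contents is rejected, because the check looks at where the pointer points rather than at what the table holds.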