CppCMS 1.2 Input Content Filtering
Problem
CppCMS 1.0 does not allow an application to inspect the content of data being uploaded or posted to a CppCMS service; the application can access the data only once it has been fully received.
This is problematic for big uploads: for example, you may be forced to discard huge files only after they have been fully transferred. It is even more problematic when you need to enforce different upload limits for different users.
Content Filters
CppCMS 1.1 introduced input content filtering methods.
First of all, an asynchronous application needs to be mounted with the special content_filter
flag, using the new API:
```cpp
srv.applications_pool().mount(
    cppcms::create_pool<my_upload>(),
    cppcms::mount_point("/uploads",0),
    cppcms::app::asynchronous | cppcms::app::content_filter
);
```
Note: CppCMS 1.1 introduced a new mount API that allows mounting both synchronous and asynchronous applications in a unified way by providing flags.
When both cppcms::app::asynchronous
and cppcms::app::content_filter
are set, the application's main function is called twice if the content is not empty:
The first call happens when the request headers are ready. At this point the application can adjust various security limits per user, or decide whether the user is allowed to upload data at all. This is also the point at which filters for on-progress data handling can be installed. You can detect this state by checking that
request().is_ready()
returns false. The second call happens when the entire request body is ready, and
request().is_ready()
returns true.
For example:
```cpp
class my_upload : public cppcms::application {
public:
    ...
    void main(std::string path)
    {
        if(!request().is_ready()) {
            session().load();
            if(!session().is_set("user_id"))
                throw cppcms::http::abort_upload(403);
            int uid = session().get<int>("user_id");
            long long len = file_size_limit_for_user(uid);
            request().limits().multipart_form_data_limit(len);
        }
        else {
            // process data
        }
    }
};
```
At the first call we check whether the client is authorized to upload files at all, before huge data is transferred.
If he isn't, we throw abort_upload
with HTTP status code 403 (Forbidden). You can also write some meaningful message to the output before throwing the abort_upload
exception.
Then you can adjust various limits on a per-user basis. The next time main is called, the request is ready.
Note:
- These two calls of main() are not necessarily executed one after the other: the same application handles multiple connections, and you should not assume that consecutive calls belong to the same request.
- At the filtering stage cppcms::application does not own the context, i.e. you cannot call release_context() and handle it yourself.
Installing Filters
There are two useful filters derived from cppcms::http::basic_content_filter:
multipart_content_filter
and raw_content_filter.
The first handles ordinary file uploads and lets you examine each file on the fly; the second lets you handle the raw input data as it arrives.
For example, here is an application that handles huge PUT requests whose bodies cannot be stored in memory.
First, let's create a filter that handles raw data.
```cpp
#include <cstdio>   // for remove()
#include <fstream>

using cppcms::http::raw_content_filter;
using cppcms::http::abort_upload;

class save_file : public raw_content_filter {
public:
    std::string name;
    bool remove_in_dtor;

    save_file(std::string n) : name(n), remove_in_dtor(true)
    {
        stream_.open(n.c_str());
        if(!stream_)
            throw abort_upload(500);
    }
    ~save_file()
    {
        stream_.close();
        if(remove_in_dtor)
            remove(name.c_str());
    }
    // handle end of stream
    void on_end_of_content()
    {
        stream_.close();
        if(!stream_)
            throw abort_upload(500);
    }
    // handle new data
    void on_data_chunk(void const *p,size_t n)
    {
        stream_.write(static_cast<char const *>(p),n);
        if(!stream_)
            throw abort_upload(500);
    }
    // handle the case of a disconnected peer
    void on_error()
    {
        stream_.close();
    }
private:
    std::ofstream stream_;
};
```
It implements three virtual member functions and a virtual destructor; they are quite self-explanatory.
Now let's look at our application:
```cpp
void main(std::string path)
{
    if(!request().is_ready() && request().method()=="PUT") {
        request().limits().content_length_limit(huge_limit);
        // note: request() now owns the filter
        request().reset_content_filter(new save_file(get_new_path()));
    }
    else {
        save_file *ptr = dynamic_cast<save_file *>(request().filter());
        if(!ptr) {
            // handle a request that is not an upload
        }
        else {
            // all ok - do not delete the file
            ptr->remove_in_dtor = false;
            handle_file(ptr->name);
        }
    }
}
```
If we get a PUT request we install a filter that saves the input data to a file; once the content processing is completed, main is called again and we use the data produced by the filter.
This way PUT or POST requests of multiple gigabytes can be handled safely.
Resubmitting Context
An asynchronous application can easily handle some basic filtering, but if post-processing requires CPU-intensive work such as video encoding, it cannot be done in the asynchronous application itself.
CppCMS 1.1 introduced a convenient API to transfer context between application pools:
```cpp
using cppcms::application_specific_pool;
using cppcms::http::context;
using booster::weak_ptr;
using booster::shared_ptr;

class myfilter : public cppcms::application {
public:
    typedef shared_ptr<application_specific_pool> pool_ptr;
    typedef weak_ptr<application_specific_pool> weak_pool_ptr;

    myfilter(cppcms::service &srv,weak_pool_ptr ptr) :
        cppcms::application(srv),
        wpool_(ptr)
    {
        ...
    }
    void main(std::string path)
    {
        if(!request().is_ready()) {
            // handle all the filtering stuff
        }
        else {
            pool_ptr pool = wpool_.lock();
            if(pool) {
                shared_ptr<context> ctx = release_context();
                // now this context goes to another
                // application to be run
                ctx->submit_to_pool(pool,path);
            }
        }
    }
    ...
private:
    weak_pool_ptr wpool_;
};
```
Note:
- booster::shared_ptr<cppcms::application_specific_pool>
is returned by the cppcms::create_pool function.
- We store the reference to the pool as a weak reference because during the upload process cppcms::http::context actually owns the cppcms::application instances; a weak reference prevents the accidental creation of a reference cycle.