ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-11-21 15:12:02 +00:00


Internal implementation of the memcpy function.

It has the following advantages over the libc-supplied implementation:

it is linked statically, so the function is called directly, not through a PLT (the procedure linkage table of a shared library);
it is linked statically, so the function can have position-dependent code;
your binaries will not depend on glibc's memcpy, which forces a dependency on a specific symbol version such as memcpy@@GLIBC_2.14 and consequently on a specific version of the glibc library;
you can include memcpy.h directly, which gives the function a chance to be inlined; this helps when memory regions are small but their size is not known at compile time;
this version of memcpy claims to be faster (in our benchmarks, the difference is within a few percent).
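
To illustrate the inlining point above, here is a minimal sketch of a header-style copy routine. It is not the FastMemcpy code: the name `inline_memcpy_sketch` is hypothetical, and the sketch assumes a compiler that provides `<emmintrin.h>` when SSE2 is enabled, falling back to a plain byte loop otherwise.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical helper for illustration only; not part of FastMemcpy.
 * Being `static inline` in a header lets the compiler expand it at the
 * call site instead of emitting a call through the PLT. */
static inline void *inline_memcpy_sketch(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

#if defined(__SSE2__)
    /* Copy 16 bytes per iteration using unaligned SSE2 loads and stores. */
    while (n >= 16) {
        _mm_storeu_si128((__m128i *)d, _mm_loadu_si128((const __m128i *)s));
        d += 16;
        s += 16;
        n -= 16;
    }
#endif

    /* Copy the remaining bytes (or everything, without SSE2) one at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

A real implementation would handle small sizes with a size switch and large sizes with prefetching or non-temporal stores; the sketch only shows why a statically visible definition can be inlined.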

Currently it uses the implementation from Linwei (skywind3000@163.com). See https://www.zhihu.com/question/35172305 for discussion.

Drawbacks:

it only uses SSE2 and does not use wider (AVX, AVX-512) vector registers when available;
it does no CPU dispatching and does not take the actual cache size into account.
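
For context, the missing CPU dispatching could look roughly like the sketch below. It is an assumption-laden illustration, not something this library does: all function names are hypothetical, and it relies on the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports` plus the `target("avx2")` function attribute. It does not address cache sizes.

```c
#include <stddef.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#define SKETCH_HAVE_X86_DISPATCH 1
#include <immintrin.h> /* AVX2 intrinsics */
#endif

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Portable fallback: plain byte loop (hypothetical name). */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}

#ifdef SKETCH_HAVE_X86_DISPATCH
/* AVX2 variant, compiled for AVX2 regardless of the global -m flags. */
static __attribute__((target("avx2")))
void *copy_avx2(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    /* Copy 32 bytes per iteration with unaligned 256-bit loads/stores. */
    while (n >= 32) {
        _mm256_storeu_si256((__m256i *)d,
                            _mm256_loadu_si256((const __m256i *)s));
        d += 32;
        s += 32;
        n -= 32;
    }
    /* Copy the remaining 0..31 bytes one at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
#endif

/* Pick an implementation based on what the running CPU supports. */
static copy_fn select_copy(void)
{
#ifdef SKETCH_HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return copy_avx2;
#endif
    return copy_bytes;
}

/* Resolve the implementation on first use, then call it.
 * Note: this lazy initialization is not thread-safe as written. */
static void *dispatched_copy(void *dst, const void *src, size_t n)
{
    static copy_fn impl = NULL;
    if (impl == NULL)
        impl = select_copy();
    return impl(dst, src, n);
}
```

The indirect call through `impl` prevents inlining, which is one reason a statically linked, inlinable memcpy and runtime dispatching pull in opposite directions for small copies.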

Also worth looking at: