ClickHouse/src/Functions/byteSwap.cpp

#include <Functions/FunctionFactory.h>
#include <Functions/FunctionUnaryArithmetic.h>

namespace DB
{
namespace ErrorCodes
{
extern const int NOT_IMPLEMENTED;
}

namespace
{
/// A single byte has no byte order, so UInt8 is returned unchanged.
template <typename T>
requires std::is_same_v<T, UInt8>
inline T byteSwap(T x)
{
    return x;
}

template <typename T>
requires std::is_same_v<T, UInt16>
inline T byteSwap(T x)
{
    return __builtin_bswap16(x);
}

template <typename T>
requires std::is_same_v<T, UInt32>
inline T byteSwap(T x)
{
    return __builtin_bswap32(x);
}

template <typename T>
requires std::is_same_v<T, UInt64>
inline T byteSwap(T x)
{
    return __builtin_bswap64(x);
}

/// Fallback for every other type: reject the call at runtime with NOT_IMPLEMENTED.
template <typename T>
inline T byteSwap(T)
{
    throw Exception(ErrorCodes::NOT_IMPLEMENTED, "byteSwap() is not implemented for {} datatype", demangle(typeid(T).name()));
}

template <typename T>
struct ByteSwapImpl
{
    using ResultType = T;
    static constexpr bool allow_string_or_fixed_string = false;
    static inline T apply(T x) { return byteSwap<T>(x); }

#if USE_EMBEDDED_COMPILER
    static constexpr bool compilable = false;
#endif
};

struct NameByteSwap
{
    static constexpr auto name = "byteSwap";
};
/// The trailing `true` marks byteSwap as injective: distinct inputs give distinct outputs.
using FunctionByteSwap = FunctionUnaryArithmetic<ByteSwapImpl, NameByteSwap, true>;

}

/// Reversing bytes does not preserve ordering, so no monotonicity is declared.
template <>
struct FunctionUnaryArithmeticMonotonicity<NameByteSwap>
{
    static bool has() { return false; }
    static IFunction::Monotonicity get(const Field &, const Field &) { return {}; }
};

REGISTER_FUNCTION(ByteSwap)
{
    factory.registerFunction<FunctionByteSwap>(
        FunctionDocumentation{
            .description = R"(
Accepts an unsigned integer `operand` and returns the integer obtained by swapping the **endianness** of `operand`, i.e. by reversing its bytes.

Currently, this is implemented for UInt8, UInt16, UInt32 and UInt64.

**Example**

```sql
SELECT byteSwap(3351772109)
```

Result:

```result
┌─byteSwap(3351772109)─┐
│           3455829959 │
└──────────────────────┘
```

The above example can be worked out as follows:
1. Convert the operand from base 10 to its hexadecimal representation in big-endian byte order, i.e. 3351772109 -> C7 C7 FB CD (4 bytes)
2. Reverse the bytes, i.e. C7 C7 FB CD -> CD FB C7 C7
3. Convert the result back to an integer, again assuming big-endian byte order, i.e. CD FB C7 C7 -> 3455829959

Note that step 1 can equally use little-endian byte order, as long as step 3 also assumes little-endian when converting back to an integer.

One use case of this function is reversing the byte order of IPv4 addresses:
```result
┌─toIPv4(3351772109)─┐
│ 199.199.251.205    │
└────────────────────┘

┌─toIPv4(byteSwap(3351772109))─┐
│ 205.251.199.199              │
└──────────────────────────────┘
```
)",
            .examples{
                {"8-bit", "SELECT byteSwap(54)", "54"},
                {"16-bit", "SELECT byteSwap(4135)", "10000"},
                {"32-bit", "SELECT byteSwap(3351772109)", "3455829959"},
                {"64-bit", "SELECT byteSwap(123294967295)", "18439412204227788800"},
            },
            .categories{"Mathematical"}},
        FunctionFactory::CaseInsensitive);
}

}
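
The worked example in the function documentation above can be checked outside ClickHouse. The following is a minimal standalone sketch, not part of the ClickHouse source: `reverseBytes` and `matchesBuiltin32` are hypothetical helper names, and the only assumption about the toolchain is that it is GCC or Clang, since `__builtin_bswap32` is a compiler builtin. The shift-based loop mirrors steps 1 to 3 of the description, and the `static_assert` lines repeat the values listed in `.examples`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not ClickHouse code): reverse the bytes of an unsigned
// integer using shifts only, so the result does not depend on host endianness.
template <typename T>
constexpr T reverseBytes(T x)
{
    T result = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
    {
        // Move the lowest remaining byte of x to the low end of result.
        result = static_cast<T>((result << 8) | (x & 0xFF));
        x = static_cast<T>(x >> 8);
    }
    return result;
}

// Values taken from the .examples list in the function documentation.
static_assert(reverseBytes<std::uint8_t>(54) == 54, "8-bit");
static_assert(reverseBytes<std::uint16_t>(4135) == 10000, "16-bit");
static_assert(reverseBytes<std::uint32_t>(3351772109u) == 3455829959u, "32-bit");
static_assert(reverseBytes<std::uint64_t>(123294967295ull) == 18439412204227788800ull, "64-bit");

// Hypothetical helper: cross-check the portable loop against the GCC/Clang
// builtin that the UInt32 overload above relies on.
inline bool matchesBuiltin32(std::uint32_t x)
{
    return reverseBytes(x) == __builtin_bswap32(x);
}
```

Because the checks are `static_assert`s, compiling the snippet is already the test; `matchesBuiltin32` can additionally be called at runtime with arbitrary inputs.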

Commit history (from blame):
- 2023-09-24 19:58:00 +00:00: Add function `byteSwap`. byteSwap accepts an integer `operand` and returns the integer which is obtained by swapping the endianness of `operand` i.e. reversing the bytes of the `operand`. Issue: #54734
- 2023-10-02 17:50:56 +00:00: Implement byteswap for following dtypes. UInt[8|16|32|64] TODOs: - Improve NOT_IMPLEMENTED error message. - Add implementation for FixedStrings (reverse the bytes). - See whether this needs to be implemented for UInt[128|256] and signed integers as well.
- 2023-10-07 23:05:07 +00:00: Address a few review comments. - Consider byteswap injective. - Make function case-insensitive. - Add in-code documentation and copy-paste it to the markdown docs.