Flink from Beginner to Practice, Part 4 [DataStream API] - 21 - Sink: Writing Data to a Socket

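Let's look at how to write the results of a Flink computation to a socket.

A listening socket receives the Flink output.
1. Start the socket listener (do this before running the job, so the sink has something to connect to)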
nc -lk 7777

2. Write the Flink code that outputs to the socket

package org.itzhimei.sink;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Sink_2_Socket {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);
        DataStreamSource<String> dataStreamSource = env.fromCollection(Arrays.asList(
                "hello flink",
                "hello java",
                "hello world",
                "test",
                "source",
                "collection"));

        SingleOutputStreamOperator<Tuple2<String, Integer>> tuple2SingleOutputStreamOperator = dataStreamSource.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String s, Collector<Tuple2<String, Integer>> out) throws Exception {
                String[] words = s.split(" ");
                for (String word : words) {
                    out.collect(new Tuple2<>(word, 1));
                }
            }
        });

        // Define the socket target and the serialization logic
        tuple2SingleOutputStreamOperator.writeToSocket("localhost", 7777, new SerializationSchema<Tuple2<String, Integer>>() {
            @Override
            public byte[] serialize(Tuple2<String, Integer> element) {
                String s = element.f0 + ":" + element.f1 + "\n";
                return s.getBytes(StandardCharsets.UTF_8);
            }
        });
        env.execute();
    }
}


/* Socket output:
hello:1
flink:1
hello:1
java:1
hello:1
world:1
test:1
source:1
collection:1
 */
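
If `nc` is not available (for example on Windows), the listener from step 1 could be replaced by a small Java program. The following is only a sketch: the class `SocketLineReceiver` and its method `receiveLines` are hypothetical names introduced here, not part of the original demo or of Flink. It assumes the sender terminates each record with a newline, as the demo's serializer does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for `nc -lk 7777`; not part of the original demo.
public class SocketLineReceiver {

    // Accepts one client on the given server socket, prints every line it sends,
    // and returns the lines once the client closes the connection.
    public static List<String> receiveLines(ServerSocket server) throws IOException {
        List<String> lines = new ArrayList<>();
        try (Socket client = server.accept();
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Port 7777 matches the port passed to writeToSocket in the demo.
        try (ServerSocket server = new ServerSocket(7777)) {
            while (true) { // keep accepting new connections, similar to nc's -k flag
                receiveLines(server);
            }
        }
    }
}
```

Start this program first, then run the Flink job; each record sent by the sink should appear as one printed line.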

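When writing data out through a sink, the two key points are configuring the connection and configuring the serialization logic. Plain string output can use SimpleStringSchema directly, while other element types need a custom serializer.
In the demo, we implement SerializationSchema to define custom serialization logic for a two-element tuple (Tuple2).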

tuple2SingleOutputStreamOperator.writeToSocket("localhost", 7777, new SerializationSchema<Tuple2<String, Integer>>() {
    @Override
    public byte[] serialize(Tuple2<String, Integer> element) {
        // Encode each tuple as one UTF-8 "word:count" line
        String s = element.f0 + ":" + element.f1 + "\n";
        return s.getBytes(StandardCharsets.UTF_8);
    }
});
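
Because a socket is a plain byte stream with no record boundaries, the trailing "\n" is what lets the receiver tell one record from the next. The encoding can be checked in isolation, without a running Flink job. The sketch below mirrors the body of `serialize` in plain Java; the class `TupleLineFormat` and its method `toBytes` are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that mirrors the body of serialize(...) in the demo.
public class TupleLineFormat {

    // Encodes one (word, count) pair as a UTF-8 "word:count\n" record.
    public static byte[] toBytes(String word, int count) {
        String line = word + ":" + count + "\n";
        return line.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes("hello", 1);
        // Decode the bytes the way a receiver such as nc would display them.
        System.out.print(new String(bytes, StandardCharsets.UTF_8));
        System.out.println(bytes.length + " bytes");
    }
}
```

Running `main` prints `hello:1` on one line and `8 bytes` on the next: seven characters for the record plus one for the newline delimiter.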