Glue wraps PySpark and Spark SQL
PySpark Select columns
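# Row count of the Glue DynamicFrame
DataSource0.count()
# Print the inferred schema
DataSource0.printSchema()
# Convert the DynamicFrame to a Spark DataFrame
df = DataSource0.toDF()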
Find rows whose value column contains letters or digits (note that \w also matches the underscore)
df.filter(df['value'].rlike(r'\w+')).show()
Find rows whose value column contains only letters and digits (whitespace and colons are also allowed)
df.filter(~df['value'].rlike(r'[^a-zA-Z\d\s:]')).show()
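The two regex filters can be sanity-checked locally without a Spark session. The following is only a rough sketch: it assumes that Python's re.search with the re.ASCII flag behaves like the Java regex engine behind rlike for these simple patterns (both do a partial match), and it writes the second pattern as a bracketed negated character class. The sample values and the helper names contains_word_chars and only_allowed_chars are hypothetical.

```python
import re

# Hypothetical sample values standing in for the `value` column.
samples = ["abc123", "12:30 PM", "***", "a_b", "hello!", ""]

# re.ASCII keeps \w, \d and \s ASCII-only, which is assumed here to
# approximate the default behaviour of the Java regex engine used by rlike.
HAS_WORD_CHAR = re.compile(r"\w+", re.ASCII)
HAS_OTHER_CHAR = re.compile(r"[^a-zA-Z\d\s:]", re.ASCII)


def contains_word_chars(value):
    """True if the value has at least one letter, digit or underscore."""
    return HAS_WORD_CHAR.search(value) is not None


def only_allowed_chars(value):
    """True if the value has no character outside letters, digits, whitespace and ':'."""
    return HAS_OTHER_CHAR.search(value) is None


with_word_chars = [s for s in samples if contains_word_chars(s)]
only_allowed = [s for s in samples if only_allowed_chars(s)]

print(with_word_chars)
print(only_allowed)
```

Note that the empty string passes the second check, because it contains no disallowed character; add a length condition if empty values should be excluded.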