Characters and Strings in Go Language

neotam

Go has two data types, rune and string, that represent characters and strings respectively.

Characters in Go

Characters are runes in Go. Rune literals are written in single quotes ('). A rune literal can be any valid Unicode character, or an escape sequence that denotes a Unicode code point or a byte value. rune is an alias for the data type int32, so it is meant to store a Unicode code point.

Valid rune literals include:

'h'
'ఇ'
'\t'
'\v'
'\x56'
'\u0c05'
'\U0001f308'
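
As a minimal sketch of the point that rune is an alias for int32, the program below assigns a rune to an int32 without a conversion and prints several of the literals above with their code points. The helper name describe is made up for this example.

```go
package main

import "fmt"

// describe is a hypothetical helper for this example. It formats a rune
// as its character, its Unicode code point, and its numeric value.
func describe(r rune) string {
	return fmt.Sprintf("%c %U %d", r, r, r)
}

func main() {
	var r rune = 'h'
	var n int32 = r // no conversion needed: rune is an alias for int32
	fmt.Println(n)  // 104

	// Each literal is just a number naming a Unicode code point.
	for _, c := range []rune{'h', 'ఇ', '\x56', '\u0c05', '\U0001f308'} {
		fmt.Println(describe(c))
	}
}
```

The first line of the loop output is "h U+0068 104", and '\x56' is reported as "V U+0056 86".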

There are two escape sequences for representing Unicode code points: \uxxxx and \Uxxxxxxxx.

\uxxxx: represents a character with a 16-bit hex value. \u must be followed by exactly 4 hexadecimal digits.
\Uxxxxxxxx: represents a character with a 32-bit hex value. \U must be followed by exactly 8 hexadecimal digits.

Two other ways to represent the numeric value of a character, using octal and hexadecimal digits, are as follows:

\ooo: \ followed by exactly 3 octal digits, representing a value from 0 to 255.
\xhh: \x followed by exactly 2 hexadecimal digits, representing a value from 0 to 255.
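
To make the escape forms concrete, here is a small sketch that writes the letter V (code point U+0056, decimal 86, octal 126) in each notation and checks that they all denote the same rune. The helper sameRune is an assumption of this example, not part of any library.

```go
package main

import "fmt"

// sameRune is a hypothetical helper that reports whether all the given
// runes are identical.
func sameRune(rs ...rune) bool {
	for _, r := range rs {
		if r != rs[0] {
			return false
		}
	}
	return true
}

func main() {
	// Plain character, hexadecimal byte, octal byte, 16-bit and 32-bit code point.
	fmt.Println(sameRune('V', '\x56', '\126', '\u0056', '\U00000056')) // true
}
```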

Strings in Go

Strings are sequences of characters defined using either double quotes (") or back ticks (`). Strings are immutable and support indexing; indexing a string yields a byte, not a rune. The built-in function len returns the length of a string in bytes.

Two forms of string literals:

"Hello Go!"
Interpreted string literal: written on a single line; escape sequences like \t and \n keep their special meaning.

`hello
Go!`
Raw string literal: may span multiple lines; backslashes have no special meaning, so \n in a raw string is the same as "\\n" in an interpreted string.
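
The difference between the two forms can be checked with a short program. This is only a sketch; the constant names interpreted and raw are chosen for the example.

```go
package main

import "fmt"

// Both literals have the same four characters between the quotes, but only
// the interpreted literal processes the \n escape sequence.
const (
	interpreted = "a\nb"
	raw         = `a\nb`
)

func main() {
	fmt.Println(len(interpreted)) // 3: 'a', newline, 'b'
	fmt.Println(len(raw))         // 4: 'a', backslash, 'n', 'b'
	fmt.Println(raw == "a\\nb")   // true

	// A raw literal may contain a real line break.
	multi := `hello
Go!`
	fmt.Println(multi == "hello\nGo!") // true
}
```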

String literals are implicitly UTF-8 encoded, because Go source code is UTF-8 text.
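
Because of the UTF-8 encoding, the number of bytes in a string can differ from the number of characters. The sketch below assumes a helper named byteAndRuneCount (made up for this example) and uses utf8.RuneCountInString from the standard library to count code points.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount is a hypothetical helper. It returns the length of s
// in bytes and in Unicode code points.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Go🌈"
	b, r := byteAndRuneCount(s)
	fmt.Println(b, r) // 6 3: the rainbow takes 4 bytes in UTF-8

	fmt.Println(s[0]) // 71, the byte value of 'G'

	// range decodes UTF-8 and yields the byte offset and rune of each character.
	for i, c := range s {
		fmt.Printf("%d: %c\n", i, c)
	}

	// The same string written directly, as UTF-8 bytes, and as a code point.
	fmt.Println("🌈" == "\xf0\x9f\x8c\x88", "🌈" == "\U0001f308") // true true
}
```

Using range instead of an index loop is the usual way to walk a string character by character, since an index loop would visit individual bytes.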

Valid string literals in Go:

"hello G!"

`Hello
Go!`           // Raw string, multi-line

`\n
\n`   // Same as "\\n\n\\n"

"여보세요"  // UTF-8 text
"\xf0\x9f\x8c\x88"   // UTF-8 bytes written with \x escapes, same as "🌈"
"\U0001F304 \U0001f308 \U0001F304"  // 32-bit \U escapes for Unicode code points -> "🌄 🌈 🌄"
"\u05D2\u05D3" // 16-bit \u escapes for Unicode code points -> "גד"


To encode or decode UTF-16, use the package "unicode/utf16".
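
A minimal sketch of a round trip through that package is shown below. utf16.Encode and utf16.Decode are the standard library functions; the wrapper names toUTF16 and fromUTF16 are made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// toUTF16 is a hypothetical wrapper that encodes a Go string as UTF-16 code units.
func toUTF16(s string) []uint16 {
	return utf16.Encode([]rune(s))
}

// fromUTF16 is a hypothetical wrapper that decodes UTF-16 code units into a Go string.
func fromUTF16(units []uint16) string {
	return string(utf16.Decode(units))
}

func main() {
	units := toUTF16("🌈")
	fmt.Printf("%x\n", units)             // [d83c df08], a surrogate pair
	fmt.Println(fromUTF16(units) == "🌈") // true
}
```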

A slice expression can be used to extract a substring from a given string. The syntax of a slice is:

s[start : stop]

Both start and stop are byte indices into the given string. The slice expression constructs a new string from index start up to index "stop - 1". Both start and stop are optional. If start is omitted it defaults to 0, and if stop is omitted it defaults to the length of the string, so the slice runs through the last byte. The following program illustrates slice expressions:

package main

import "fmt"

func main() {
	msg := "Hello, welcome to Go. It is fast and elegant"

	//length
	fmt.Printf("Length of Message: %v \n", len(msg))

	// Slice till end
	fmt.Println(msg[7:])

	// Slice from the beginning
	fmt.Println(msg[:20])

	// Slice from start up to (but not including) stop
	fmt.Println(msg[7:20])

	// Full message
	fmt.Println(msg[:])
}

Length of Message: 44
welcome to Go. It is fast and elegant
Hello, welcome to Go
welcome to Go
Hello, welcome to Go. It is fast and elegant

Two strings can be concatenated with the plus (+) operator:

name := "Bob"
msg := "Hello " + name + ", welcome to Go"
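
A runnable sketch of concatenation follows. The function greet is a hypothetical helper for this example. Since strings are immutable, + and += build a new string rather than modifying an existing one.

```go
package main

import "fmt"

// greet is a hypothetical helper that builds a greeting with the + operator.
func greet(name string) string {
	return "Hello " + name + ", welcome to Go"
}

func main() {
	msg := greet("Bob")
	fmt.Println(msg) // Hello Bob, welcome to Go

	// += creates a new string and assigns it back to msg.
	msg += "!"
	fmt.Println(msg) // Hello Bob, welcome to Go!
}
```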
