Mojo Study Notes (5)

Continuing with Mojo's type system, today's topic is strings.

String

Besides the numeric types, the other commonly used type is String.

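s = String("Mojo")
print(s)

# Walk the string by position; each index addresses one byte
for index in range(len(s)):
    print(s[index])

# Iterate over the raw bytes; b[] dereferences the reference yielded by the loop
bytes = s.as_bytes()
for b in bytes:
    print(b[])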

Output:

Mojo
M
o
j
o
77
111
106
111

String subscripts such as s[index] are byte-based. You can see this by inspecting s._buffer:

print(Python.type(s._buffer))

This fails to compile with:

can not be converted from 'List[SIMD[si8, 1]]' to 'PythonObject'

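Internally, a String is just a sequence of signed 8-bit integers. Characters are stored as UTF-8, so a common Chinese character takes 3 bytes and most emoji take 4 bytes.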
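The byte counts can be checked with a short sketch. It is written in Python rather than Mojo, and the sample characters are assumptions chosen for illustration. It also shows why a buffer of signed 8-bit integers holds negative numbers for non-ASCII text.

```python
# Illustration in Python (not Mojo): how many UTF-8 bytes each character needs.
samples = ["M", "é", "中", "😀"]
byte_counts = {ch: len(ch.encode("utf-8")) for ch in samples}
for ch, n in byte_counts.items():
    print(ch, n)

# A byte above 127 becomes negative when reinterpreted as a signed 8-bit value.
first_byte = "中".encode("utf-8")[0]  # 0xE4, i.e. 228 as an unsigned byte
as_signed = first_byte - 256 if first_byte > 127 else first_byte
print(first_byte, as_signed)
```

ASCII characters stay at one byte each, which is why the "Mojo" example above prints the same number of bytes as characters.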

[Image: mojo]

By contrast, string subscripts in Python are character-based (compare with the image above: the slice indices differ):

[Image: mojo]
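The difference between character-based and byte-based indexing can be sketched in plain Python, where str is indexed by character and bytes is indexed by byte. The sample string is an assumption chosen for illustration, and the bytes object only stands in for Mojo's byte-level view of a String.

```python
# Python str indexes by character; the encoded bytes index by byte.
text = "Mojo语言"
data = text.encode("utf-8")

print(len(text), len(data))  # character count vs. byte count

# The fifth character is one index in the str ...
fifth_char = text[4]
print(fifth_char)

# ... but a 3-byte slice in the UTF-8 data.
fifth_from_bytes = data[4:7].decode("utf-8")
print(fifth_from_bytes)
```

For pure ASCII text the two views agree, so the mismatch only shows up once multi-byte characters are involved.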

In Mojo, once a variable's type has been determined it cannot change. For example:

s = String("Mojo")
print(Python.type(s), s)
s = 100
print(Python.type(s), s)
print(s[0])

This outputs:

<class 'str'> Mojo
<class 'str'> 100
1

s is still a String. Presumably s = 100 implicitly converted the integer to a String and then assigned it.
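For contrast, here is a minimal sketch of the same sequence in plain Python, where rebinding a name really does change its type. The explicit str(100) at the end is only an analogy for the implicit conversion Mojo appears to perform, not a description of Mojo's internals.

```python
# In Python, assignment rebinds the name, so the type follows the new value.
s = "Mojo"
print(type(s).__name__, s)

s = 100
print(type(s).__name__, s)

# An int is not subscriptable, so s[0] fails instead of printing "1".
try:
    s[0]
except TypeError as err:
    print("TypeError:", err)

# Converting explicitly reproduces what the Mojo snippet printed.
as_text = str(100)
print(as_text[0])
```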

String also defines the usual methods, such as lower, upper, strip, find, rfind, startswith, and endswith. There is nothing particularly special about them.
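These method names match Python's str methods. As a rough reference, the sketch below shows what the Python versions do. Whether Mojo's String methods behave identically in every detail is an assumption that is not verified here, and the sample string is made up for illustration.

```python
# Python's str methods with the same names as the Mojo String methods listed above.
s = "  Mojo Lang  "
trimmed = s.strip()  # remove leading and trailing whitespace

print(trimmed.lower())
print(trimmed.upper())

# find returns the first match index, rfind the last one.
print(trimmed.find("o"), trimmed.rfind("o"))

print(trimmed.startswith("Mojo"), trimmed.endswith("Lang"))
```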

The other string type is StringLiteral (a string literal):

s = "Mojo"

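A value defined this way is a StringLiteral, which can be thought of as a pointer to a string memory buffer. You can assign another string literal to the variable, but you cannot change its type, and subscripting and similar operations are not currently provided. The lines marked invalid below are therefore illegal: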

s = "Mojo"
print(s[0])   # invalid
s = "Python"  # OK
s = 100       # invalid
s = String("Rust")  # invalid

print(s[0]) reports:

error: 'StringLiteral' is not subscriptable, it does not implement the __getitem__/__setitem__ or __refitem__ methods

s = 100 reports:

error: cannot implicitly convert 'IntLiteral' value to 'StringLiteral' in assignment

s = String("Rust") reports:

error: cannot implicitly convert 'String' value to 'StringLiteral' in assignment

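Having experimented this far, Mojo does not feel very mature yet. Quite a few things go against intuition, and it is unclear whether they are by design, bugs, or simply not implemented yet.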
