Add new SentenceTransformer model

Files changed (13)

.gitattributes +1 -0
1_Pooling/config.json +10 -0
README.md +1603 -0
config.json +49 -0
config_sentence_transformers.json +14 -0
configuration.py +114 -0
model.safetensors +3 -0
modeling.py +1319 -0
modules.json +20 -0
sentence_bert_config.json +4 -0
special_tokens_map.json +51 -0
tokenizer.json +3 -0
tokenizer_config.json +62 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text

1_Pooling/config.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "word_embedding_dimension": 768,
+    "pooling_mode_cls_token": true,
+    "pooling_mode_mean_tokens": false,
+    "pooling_mode_max_tokens": false,
+    "pooling_mode_mean_sqrt_len_tokens": false,
+    "pooling_mode_weightedmean_tokens": false,
+    "pooling_mode_lasttoken": false,
+    "include_prompt": true
+}
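
The pooling config above enables only `pooling_mode_cls_token`, so the sentence embedding is presumably taken from the first token's 768-dimensional vector. The sketch below illustrates that choice in plain Python. It is not the sentence-transformers implementation: the helper names `active_pooling_modes`, `cls_pool`, and `mean_pool` are hypothetical, the toy vectors are invented for illustration, and batching and attention masks are ignored.

```python
import json

# Flags copied from the 1_Pooling/config.json added in this commit.
POOLING_CONFIG = json.loads("""
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": true,
    "pooling_mode_mean_tokens": false,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
""")


def active_pooling_modes(config):
    """Return the names of the pooling_mode_* flags that are set to true."""
    return [key for key, value in config.items()
            if key.startswith("pooling_mode_") and value is True]


def cls_pool(token_embeddings):
    """CLS pooling: the sentence vector is the first token's vector."""
    return list(token_embeddings[0])


def mean_pool(token_embeddings):
    """Mean pooling, shown only for contrast; it is disabled in this config."""
    count = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / count for i in range(dim)]


# Hypothetical 3-token sequence with 2-dimensional vectors (the real model uses 768).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(active_pooling_modes(POOLING_CONFIG))
print(cls_pool(tokens))
print(mean_pool(tokens))
```

With CLS pooling the other token vectors do not enter the sentence embedding directly; they influence it only through the transformer's attention layers.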

README.md ADDED

	@@ -0,0 +1,1603 @@

+---
+tags:
+- sentence-transformers
+- sentence-similarity
+- feature-extraction
+- dense
+- generated_from_trainer
+- dataset_size:1472
+- loss:MatryoshkaLoss
+- loss:MultipleNegativesRankingLoss
+base_model: dangvantuan/vietnamese-document-embedding
+widget:
+- source_sentence: Những điểm đặc biệt của chương trình học là gì?
+  sentences:
+  - 'Các phòng thí nghiệm này giúp sinh viên thực hành và nghiên cứu các phản ứng
+    hoá học, phân tích chất lượng sản phẩm và môi trường. CÁC ĐIỂM ĐẶC BIỆT
+    Chương trình học thực tiễn: Sinh viên có cơ hội tham gia các nghiên cứu thực tế
+    tại các phòng thí nghiệm của trường và các công ty, giúp họ phát triển các kỹ
+    năng thực hành và nghiên cứu hoá học. Môi trường học tập quốc tế: Sinh viên có
+    cơ hội tham gia các chương trình trao đổi sinh viên và hợp tác nghiên cứu với
+    các đối tác quốc tế trong lĩnh vực hoá học. Học bổng và cơ hội du học: Các chương
+    trình học bổng và cơ hội du học bậc thạc sĩ, tiến sĩ tại các trường đại học danh
+    tiếng trên thế giới. TRIỂN VỌNG NGHỀ NGHIỆP & CƠ HỘI VIỆC LÀM
+    Sinh viên tốt nghiệp ngành Hoá học có thể làm việc trong các lĩnh vực như:
+    Công nghiệp hoá chất và dược phẩm: Làm việc tại các công ty sản xuất hoá chất,
+    dược phẩm, sản xuất vật liệu và sản phẩm hoá học khác. Ngành thực phẩm và bảo
+    vệ môi trường: Nghiên cứu và phát triển các sản phẩm thực phẩm, phân tích chất
+    lượng thực phẩm, và xử lý chất thải hoá học trong công nghiệp.'
+  - 'Trường Đại học Ngoại Thương Cơ sở II
+    Tiếng Anh: Foreign Trade University Ho Chi Minh City Campus (FTU2) Trường Đại
+    học Ngoại Thương cơ sở II là cơ sở đào tạo phía Nam của Trường Đại học Ngoại thương
+    tại Hà Nội, đại học chuyên ngành kinh tế đầu ngành tại Việt Nam và thành viên
+    của Bộ Giáo dục và Đào tạo. Cơ sở này được thành lập dựa trên nhu cầu đào tạo
+    cán bộ trong lĩnh vực kinh tế và kinh doanh quốc tế tại các tỉnh thành phía Nam
+    trong giai đoạn hội nhập kinh tế quốc tế. Cơ sở được thành lập theo Quyết định
+    số 1485/GD-ĐT ngày 16/07/1993 của Bộ trưởng Bộ Giáo dục và Đào tạo Việt Nam. Tên
+    trường: Trường Đại học Ngoại thương (Cơ sở 2)
+    Tên tiếng Anh: Foreign Trade University (FTU)
+    Mã trường: NTS
+    Trực thuộc: Bộ Giáo dục và Đào tạo
+    Loại trường: Công lập
+    Loại hình đào tạo: Đại học – Sau đại học
+    Lĩnh vực: Kinh tế
+    Địa chỉ: Số 15 Đường D5, Khu Văn Thánh Bắc, Phường 25, Quận Bình Thạnh, TP Hồ
+    Chí Minh
+    Điện thoại:
+    Email:
+    Website: http://cs2.ftu.edu.vn/
+    Fanpage: https://www.facebook.com/ftu2hcmc/
+    Lịch sử
+    1962
+    Ngày 20/06/1962, theo Quyết định của Thủ tướng Chính phủ, Khoa Quan hệ Quốc tế
+    tách khỏi Trường Đại học Kinh tế - Tài chính để thành lập Trường Cán bộ Ngoại
+    giao - Ngoại thương trực thuộc Bộ Ngoại giao. Trụ sở ban đầu được đặt tại làng
+    Láng, tỉnh Hà Đông (nay là phường Láng Thượng, Hà Nội). 1967
+    Ngày 05/08/1967, theo đề nghị của Bộ Ngoại giao và Bộ Ngoại thương, Thủ tướng
+    Phạm Văn Đồng đã ký Quyết định số 123/CP, chia tách Trường Cán bộ Ngoại giao -
+    Ngoại thương thành hai trường:
+    Trường Ngoại giao (nay là Học viện Ngoại giao) trực thuộc Bộ Ngoại giao. Trường
+    Ngoại thương thuộc Bộ Ngoại thương (nay là Bộ Công Thương). 1985
+    Trường Đại học Ngoại thương chuyển từ Bộ Ngoại thương sang trực thuộc Bộ Đại học
+    và Trung học Chuyên nghiệp (nay là Bộ Giáo dục và Đào tạo). 1993
+    Ngày 16/07/1993, xuất phát từ nhu cầu đào tạo cán bộ kinh tế và kinh doanh quốc
+    tế tại Thành phố Hồ Chí Minh và các tỉnh thành phía Nam, Cơ sở II Trường Đại học
+    Ngoại thương tại TP.'
+  - 'Điểm xét tuyển được làm tròn đến 02 chữ số thập phân. - Điểm xét tuyển được xác
+    định như sau (làm tròn đến 02 chữ số thập phân): Điểm xét tuyển = [(ĐM1*HS môn
+    1+ ĐM2*HS môn 2 + ĐM3 * HS môn 3)*3]/(Tổng hệ số) + Điểm ưu tiên Khu vực + Điểm
+    ưu tiên đối tượng. (*) Điểm trúng tuyển ngành Luật, Luật kinh tế: tổ hợp Văn,
+    Sử, Địa cao hơn 1.5 điểm. (1) Ngành ngôn ngữ Anh, ngôn ngữ Trung Quốc, ngôn ngữ
+    Nhật, ngôn ngữ Hàn Quốc: Ngoại ngữ nhân hệ số 2. (2) Các ngành Khoa học máy tính,
+    Khoa học máy tính Chất lượng cao, Công nghệ thông tin, CTKT công trình xây dựng,
+    CNKT công trình xây dựng Chất lượng cao, Quản lý xây dựng: Toán nhân hệ số 2.
+    (3) Các ngành Chất lượng cao: Luật kinh tế, Ngôn ngữ Anh, Ngôn ngữ Trung Quốc,
+    Quản trị kinh doanh, Tài chính ngân hàng, Kế toán: Ngoại ngữ hệ số 2. VII.Điểm
+    chuẩn Trường ĐH Mở TP.HCM năm 2021 dựa vào kết quả học tập THPT(học bạ)
+    i.'
+- source_sentence: Nguyên tắc xét tuyển của Trường được áp dụng như thế nào khi thí
+    sinh đăng ký nhiều nguyện vọng hoặc nhiều phương thức xét tuyển?
+  sentences:
+  - '4. Đối với phương thức kết hợp thi tuyển và xét tuyển
+    4.1. Thí sinh dự xét tuyển ngành Giáo dục Mầm non trình độ đại học
+    Phải tham gia kỳ thi năng khiếu do Trường Đại học Sư phạm Thành phố Hồ Chí Minh
+    tổ chức và có kết quả đạt từ 5,0 điểm trở lên;
+    Đối với thí sinh xét tuyển sử dụng kết quả thi tốt nghiệp THPT năm 2024: ngưỡng
+    điểm đảm bảo chất lượng đầu vào, điều kiện nhận hồ sơ đăng ký xét tuyển được thông
+    báo chính thức sau khi Bộ Giáo dục và Đào tạo xác định ngưỡng đảm bảo chất lượng
+    đầu vào đại học (căn cứ kết quả kỳ thi tốt nghiệp THPT năm 2024). Đối với thí
+    sinh xét tuyển sử dụng kết quả học tập THPT: chỉ áp dụng đối với thí sinh tốt
+    nghiệp THPT năm 2024 đồng thời phải thỏa một trong hai điều kiện sau:
+    + Có học lực lớp 12 xếp loại giỏi;
+    + Có điểm xét tốt nghiệp THPT từ 8,0 trở lên. 4.2. Thí sinh dự xét tuyển ngành
+    Giáo dục Mầm non trình độ cao đẳng
+    Phải tham gia kỳ thi năng khiếu do Trường Đại học Sư phạm Thành phố Hồ Chí Minh
+    tổ chức và có kết quả đạt từ 5,0 điểm trở lên;
+    Đối với thí sinh xét tuyển sử dụng kết quả thi tốt nghiệp THPT năm 2024: ngưỡng
+    điểm đảm bảo chất lượng đầu vào, điều kiện nhận hồ sơ đăng ký xét tuyển được thông
+    báo chính thức sau khi Bộ Giáo dục và Đào tạo xác định ngưỡng đảm bảo chất lượng
+    đầu vào đại học (căn cứ kết quả kỳ thi tốt nghiệp THPT năm 2024). Đối với thí
+    sinh xét tuyển sử dụng kết quả học tập THPT: chỉ áp dụng đối với thí sinh tốt
+    nghiệp THPT năm 2024 đồng thời phải thỏa một trong hai điều kiện sau:
+    + Có học lực lớp 12 xếp loại khá;
+    + Có điểm xét tốt nghiệp THPT từ 6,5 trở lên. 4.3. Thí sinh dự xét tuyển ngành
+    Giáo dục Thể chất
+    Phải tham gia kỳ thi năng khiếu do Trường Đại học Sư phạm Thành phố Hồ Chí Minh
+    tổ chức và có kết quả đạt từ 5,0 điểm trở lên;
+    Đối với thí sinh xét tuyển sử dụng điểm thi tốt nghiệp THPT năm 2024: ngưỡng điểm
+    đảm bảo chất lượng đầu vào, điều kiện nhận hồ sơ đăng ký xét tuyển được thông
+    báo chính thức sau khi Bộ Giáo dục và Đào tạo xác định ngưỡng đảm bảo chất lượng
+    đầu vào đại học (căn cứ kết quả kỳ thi tốt nghiệp THPT năm 2024);
+    Đối với thí sinh xét tuyển sử dụng kết quả học tập THPT: chỉ áp dụng đối với thí
+    sinh tốt nghiệp THPT năm 2024 đồng thời thỏa thêm một trong các điều kiện sau:
+    + Có học lực lớp 12 xếp loại khá trở lên;
+    + Có điểm xét tốt nghiệp THPT từ 6,5 trở lên;
+    + Là vận động viên cấp 1, kiện tướng, vận động viên đã từng đoạt huy chương tại
+    Hội khỏe Phù Đổng, các giải trẻ quốc gia và quốc tế hoặc giải vô địch quốc gia
+    và quốc tế có điểm thi năng khiếu do trường tổ chức đạt loại xuất sắc (từ 9,0
+    trở lên theo thang điểm 10,0).'
+  - 'Danh mục các ngành điều kiện nộp hồ sơ xét tuyển (xem tại đây). Quy định chứng
+    chỉ tiếng Anh quốc tế tương đương (xem tại đây)
+    6. Xét tuyển thẳng, ưu tiên xét tuyển theo Quy chế của Bộ GD&ĐT – Mã phương thức
+    301: Thực hiện theo quy định của Bộ GD&ĐT
+    7. Các lưu ý khi đăng ký NVXT và nguyên tắc xét tuyển trên hệ thống Bộ
+    a. Các lưu ý khi đăng ký NVXT
+    Thí sinh nên tra cứu thông tin các nguyện vọng đăng ký xét tuyển vào TDTU theo
+    phương thức riêng tại: https://tracuuxettuyen.tdtu.edu.vn trước khi đăng ký nguyện
+    vọng lên hệ thống của Bộ GD&ĐT. Số CMND/CCCD thí sinh đã đăng ký xét tuyển trên
+    hệ thống của TDTU; đăng ký phương thức 4 trên hệ thống của Đại học Quốc gia TP.HCM
+    phải trùng khớp với số CMND/CCCD sử dụng đăng ký tài khoản trên hệ thống của Bộ
+    GD&ĐT. Trường hợp thí sinh đã đăng ký số CMND/CCCD không trùng khớp nhau giữa
+    các hệ thống trên, thí sinh phải liên hệ với TDTU để được hỗ trợ cập nhật lại
+    số CMND/CCCD cho trùng khớp với hệ thống của Bộ trước khi đăng ký nguyện vọng.
+    Thí sinh sẽ không đủ điều kiện xét tuyển nếu không sử dụng cùng 1 số CMND/CCCD
+    đăng ký giữa các hệ thống trên. Thí sinh xét tuyển vào chương trình đại học bằng
+    tiếng Anh, chương trình liên kết quốc tế nhưng không nộp chứng chỉ tiếng Anh theo
+    quy định, không dự thi năng lực tiếng Anh hoặc dự thi năng lực tiếng Anh kết quả
+    không đạt nếu đủ điểm trúng tuyển sẽ trúng tuyển vào chương trình dự bị tiếng
+    Anh. Khi thí sinh làm thủ tục nhập học, Nhà trường sẽ tổ chức cho thí sinh thi
+    đánh giá năng lực tiếng Anh. Nếu kết quả thi đánh giá năng lực của thí sinh đạt
+    trình độ tiếng Anh theo yêu cầu của chương trình (B1 đối với chương trình đại
+    học bằng tiếng Anh, B2 đối với chương trình liên kết đào tạo quốc tế) sẽ được
+    nhập học vào chương trình chính thức. Trường hợp chưa đạt năng lực tiếng Anh đầu
+    vào, thí sinh sẽ học chương trình dự bị tiếng Anh. b. Nguyên tắc xét tuyển
+    Nếu một NVXT của thí sinh đăng ký vào Trường có chọn nhiều căn cứ xét tuyển và
+    tương ứng có nhiều phương thức xét tuyển (Phương thức 1, 2, 3, 4) thì Trường sẽ
+    thực hiện việc xét tuyển theo thứ tự ưu tiên lần lượt của các phương thức như
+    sau: Phương thức 1, Phương thức 3, Phương thức 4, Phương thức 2. Thí sinh có nhiều
+    NVXT đủ điều kiện trúng tuyển thì chỉ được công nhận trúng tuyển và gọi nhập học
+    theo nguyện vọng cao nhất.'
+  - 'Thí sinh có thể dự thi cả 2 đợt thi năng khiếu để dùng điểm cao nhất của 2 đợt
+    thi xét tuyển (đợt thi 1 dự kiến ngày 15-17/08/2021; đợt thi 2 dự kiến ngày 17-20/8/2021).
+    TDTU không nhận điểm thi năng khiếu của các Trường khác chuyển sang. Xem chi tiết
+    thông báo thi năng khiếu tại https://admission.tdtu.edu.vn
+    + Thí sinh thuộc đối tượng 2- đợt 2 xét tuyển vào chương trình đại học bằng tiếng
+    Anh phải có Chứng chỉ tiếng Anh quốc tế tương đương IELTS 5.0 trở lên (còn thời
+    hạn trong vòng 2 năm tính đến ngày 01/10/2021); Thí sinh không có chứng chỉ tiếng
+    Anh quốc tế tương đương IELTS 5.0 trở lên còn thời hạn theo quy định của TDTU
+    phải đăng ký dự thi Năng lực tiếng Anh do TDTU tổ chức (trừ ngành Ngôn ngữ Anh
+    chỉ nhận chứng chỉ tiếng Anh quốc tế theo quy định) tại website: https://thinangkhieu.tdtu.edu.vn.'
+- source_sentence: Những đối tượng nào có thể đăng ký xét tuyển vào Đại học Sư phạm
+    Kỹ thuật TP.HCM và cần đáp ứng các điều kiện gì?
+  sentences:
+  - 'Hồ Chí Minh được thành lập theo Quyết định số 1485/GD-ĐT. Cơ sở vật chất
+    Địa chỉ: Số 15, Đường D5, Phường 25, Quận Bình Thạnh, TP. Hồ Chí Minh. Ban đầu,
+    do chưa có cơ sở vật chất riêng, Cơ sở II phải thuê cơ sở của Trường Cao đẳng
+    Kinh tế Đối ngoại. Qua thời gian, trường đã xây dựng được cơ sở mới đáp ứng nhu
+    cầu giảng dạy và học tập. Diện tích khuôn viên: Gần 5.000 m². Khu vực giảng dạy
+    chính: Sảnh A và sảnh B, đồng thời là nơi đặt trụ sở Ban Giám hiệu và các khoa,
+    phòng ban quản lý. Trang thiết bị: Nhiều phòng học và phòng chức năng được trang
+    bị hiện đại. Ngoài ra, trong khuôn viên còn có phân viện VJCC cơ sở TP. Hồ Chí
+    Minh, được hỗ trợ xây dựng bởi nguồn vốn từ Chính phủ Nhật Bản, tương tự như phân
+    viện tại Hà Nội. Cơ cấu tổ chức và đội ngũ cán bộ, giáo viên
+    Trong thời gian đầu mới thành lập, Cơ sở II chỉ có 02 cán bộ, và hầu hết các hoạt
+    động được chỉ đạo trực tiếp từ Cơ sở I tại Hà Nội. Tuy nhiên, với quy mô đào tạo
+    ngày càng tăng, Cơ sở II đã nhanh chóng củng cố cơ cấu tổ chức và đội ngũ cán
+    bộ, giáo viên. Hiện tại, Cơ sở II có hơn 100 cán bộ, giáo viên cơ hữu, công tác
+    tại 11 Ban và 05 Bộ môn. Các Ban
+    Ban Tổ chức - Hành chính
+    Ban Kế hoạch - Tài chính
+    Ban Quản lý đào tạo
+    Ban Công tác chính trị & Sinh viên
+    Ban Đào tạo quốc tế
+    Ban Quản trị thiết bị
+    Ban Quản lý Khoa học & Hợp tác quốc tế
+    Ban Khảo thí & Đảm bảo chất lượng
+    Ban Truyền thông & Quan hệ đối ngoại
+    Ban Thư viện
+    Ban Công tác Đảng & Đoàn thể
+    Các Bộ môn
+    Bộ môn Khoa học cơ bản
+    Bộ môn Kinh doanh & Thương mại quốc tế
+    Bộ môn Ngoại ngữ
+    Bộ môn Kinh tế - Luật
+    Bộ môn Quản trị kinh doanh & Tài chính - Kế toán'
+  - 'THÔNG TIN TUYỂN SINH Đại học Sư phạm Kỹ thuật TP.HCM
+    . Thông tin chung
+    1. Thời gian xét tuyển
+    Theo lịch tuyển sinh chung của Bộ GD&ĐT và kế hoạch tuyển sinh của trường công
+    bố cụ thể trên website. 2. Đối tượng tuyển sinh
+    Thí sinh đã tốt nghiệp THPT. 3. Phạm vi tuyển sinh
+    Tuyển sinh trong cả nước. 4. Phương thức tuyển sinh
+    4.1. Phương thức xét tuyển
+    Phương thức 1: Xét tuyển học bạ THPT. Phương thức 2: Xét tuyển thí sinh theo kết
+    quả điểm thi tốt nghiệp THPT năm 2024 theo các tổ hợp môn xét tuyển từng ngành
+    học. Phương thức 3: Xét tuyển thẳng, ưu tiên xét tuyển thẳng. 4.2. Ngưỡng đảm
+    bảo chất lượng đầu vào, điều kiện nhận ĐKXT
+    Phương thức xét tuyển bằng điểm thi THPT 2024: thí sinh phải tốt nghiệp THPT và
+    thỏa điều kiện ngưỡng đảm bảo chất lượng đầu vào của Trường. Thông báo ngưỡng
+    đảm bảo sau khi thí sinh có kết quả thi THPT. Phương thức xét tuyển bằng học bạ
+    THPT tốt nghiệp (tốt nghiệp THPT 2024): thí sinh tốt nghiệp THPT và điểm trung
+    bình học bạ mỗi môn học theo tổ hợp đăng ký xét tuyển từ 5,0 trở lên. Hồi đồng
+    thi tuyển uy quyền cho những thành viên thường trực Hội đồng tuyển sinh quyết
+    định điểm trúng tuyển các phương thức xét. Điềm chuẩn ngành Sư phạm tiếng Anh
+    theo các phương thức xét tuyển sớm sẽ được điều chỉnh khi có chỉ tiêu được giao
+    của Bộ GD&ĐT. 4.3.'
+  - '4. CÁC NGÀNH ĐÀO TẠO
+    a. ĐẠI HỌC
+    Cử nhân Sư phạm Tin học
+    Cử nhân Công nghệ Thông tin
+    b. SAU ĐẠI HỌC
+    Thạc sĩ Khoa học máy tính
+    vii. Khoa Vật lý
+    1. CHẤT LƯỢNG ĐÀO TẠO
+    ĐÀO TẠO CỬ NHÂN (4 NĂM)
+    CN Sư phạm Vật lý, CN Vật lý học
+    CN Sư phạm Công nghệ
+    TUYỂN SINH: 100 - 150 SV
+    ĐÀO TẠO CAO HỌC (2 NĂM)
+    Bắt đầu đào tạo Thạc sĩ từ 1999
+    ThS Lý luận và phương pháp dạy học bộ môn Vật lý
+    ThS Vật Lý Nguyên tử và hạt nhân
+    TUYỂN SINH: 15 - 25 HV/năm
+    2. CHẤT LƯỢNG GIẢNG VIÊN
+    ĐỘI NGŨ GIẢNG VIÊN: 35
+    Giảng viên: 35
+    Giáo sư : 1
+    Phó Giáo sư Tiến sĩ: 4
+    Tiến sĩ: 17
+    Thạc sĩ: 10
+    Cử nhân: 3
+    3. MỤC TIÊU ĐÀO TẠO
+    Đào tạo cử nhân Vật lý học, có phẩm chất chính trị, đạo đức và sức khỏe tốt, hiểu
+    và vận dụng các tri thức cơ bản của Vật lý học theo định hướng chuyên ngành. Sau
+    khi tốt nghiệp, người học có đủ năng lực để làm việc trong môi trường nghiên cứu,
+    sản xuất kinh doanh có sử dụng kiến thức Vật lý học cũng như có thể tiếp tục theo
+    các bậc học cao hơn. Đào tạo giáo viên có trình độ cử nhân Sư phạm Vật lý (hệ
+    chính quy, chính quy địa phương, hệ chuyên tu, tại chức). Sau khi tốt nghiệp,
+    người học có phẩm chất chính trị, đạo đức và sức khỏe tốt, hiểu và vận dụng các
+    tri thức cơ bản của Vật lý học, lý luận và phương pháp giảng dạy Vật lý ở trường
+    trung học. Đào tạo giáo viên dạy Công nghệ bậc Trung học cơ sở và Trung học phổ
+    thông. Sau khi tốt nghiệp, người học có phẩm chất chính trị, đạo đức và sức khỏe
+    tốt, hiểu và vận dụng các tri thức khoa học, công nghệ nền tảng vào trong dạy
+    học môn Công nghệ ở trường phổ thông. Sau khi tốt nghiệp, người học có đủ năng
+    lực để làm việc trong môi trường nghiên cứu, sản xuất kinh doanh có sử dụng kiến
+    thức khoa học, công nghệ cũng như có thể tiếp tục theo các bậc học cao hơn.'
+- source_sentence: Quá trình hình thành và phát triển của Đại học Kinh tế Thành phố
+    Hồ Chí Minh diễn ra như thế nào?
+  sentences:
+  - '1. Điểm trúng tuyển
+    Phương thức xét tuyển theo kết quả học tập THPT – Đợt 2 (PT1-Đ2), ưu tiên xét
+    tuyển theo quy định của TDTU dành cho học sinh trường chuyên trên cả nước và một
+    số trường trọng điểm ở TP.HCM – Đợt 2 (PT3-ĐT1-Đ2); ưu tiên xét tuyển theo quy
+    định của TDTU dành cho học sinh có chứng chỉ tiếng Anh quốc tế tương đương IELTS
+    5.0 trở lên – Đợt 2 (PT3-ĐT2-Đ2): Điểm xét tuyển được thực hiện theo đúng đề án
+    tuyển sinh đại học năm 2022, thang điểm 40 và được làm tròn đến 02 chữ số thập
+    phân (đã bao gồm điểm ưu tiên khu vực, đối tượng, hệ số trường THPT, điểm ưu tiên
+    thành tích học sinh giỏi). Phương thức xét tuyển theo điểm thi THPT năm 2022 (PT2):
+    Điểm xét tuyển được thực hiện theo đúng đề án tuyển sinh đại học năm 2022, là
+    tổng điểm của 3 môn theo tổ hợp (có nhân hệ số môn theo tổ hợp, ngành xét
+    tuyển theo thang điểm 40), cộng với điểm ưu tiên khu vực, đối tượng theo thang
+    điểm 40 (nếu có), được làm tròn đến 2 chữ số thập phân theo quy định của Bộ GD&ĐT.
+    Phương thức xét tuyển theo điểm thi đánh giá năng lực của Đại học Quốc gia TP.HCM
+    năm 2022 (PT5): Điểm xét tuyển được thực hiện theo đúng đề án tuyển sinh đại học
+    năm 2022 theo thang điểm 1200 (đã bao gồm điểm ưu tiên khu vực, đối tượng theo
+    thang điểm 1200)
+    Phương thức xét tuyển theo kết quả học tập THPT -Đợt 1 (PT1-Đ1) và ưu tiên xét
+    tuyển theo quy định của TDTU đợt 1 (PT3-Đ1), điểm trúng tuyển theo thông báo Kết
+    quả sơ tuyển PT1, PT3-ĐT1 các ngành trình độ đại học chính quy 2022-Đợt 1 ngày
+    30/6/2022 của HĐTS Trường. Bảng điểm trúng tuyển theo các phương thức như sau:
+    Here''s the updated table based on your additional data. I''ve kept the structure
+    consistent, with the text "HHMT≥6.0" moved to the "Điểm TT PT5" column where relevant:
+    STT Mã ngành Tên ngành Điểm TT PT1-Đ2 Điểm TT PT2 Điểm TT PT3-ĐT1-Đ2 Điểm TT PT3-ĐT2-Đ2
+    Điểm TT PT5 Chương trình tiêu chuẩn 1 7210402 Thiết kế công nghiệp 26.5 23 30
+    650 HHMT≥6.0 2 7210403 Thiết kế đồ họa 29.5 27 32 700 HHMT≥6.0 3 7210404 Thiết
+    kế thời trang 26.5 24 30 650 HHMT≥6.0 4 7220201 Ngôn ngữ Anh 37 34 36 800 5 7220204
+    Ngôn ngữ Trung Quốc 37 33 35 800 6 7310301 Xã hội học 31.5 28.5 31 650 7 7310630
+    Việt Nam học (Chuyên ngành: Du lịch và lữ hành) 34 31.8 33 700 8 7310630Q Việt
+    Nam học (Chuyên ngành: Du lịch và quản lý du lịch) 34 31.8 33 700 9 7340101 Quản
+    trị kinh doanh (Chuyên ngành: Quản trị nguồn nhân lực) 37 33.6 36 800 10 7340101N
+    Quản trị kinh doanh (Chuyên ngành: Quản trị nhà hàng - khách sạn) 35.75 30.5 35
+    800 11 7340115 Marketing 37.75 34.8 37 870 12 7340120 Kinh doanh quốc tế 37.5
+    34.5 37 870 13 7340201 Tài chính - Ngân hàng 36.75 33.6 35.25 750 14 7340301 Kế
+    toán 36 33.3 34.25 720 15 7340408 Quan hệ lao động (Chuyên ngành Quản lý Quan
+    hệ lao động, Chuyên ngành Hành vi tổ chức) 28 27 31 700 16 7380101 Luật 36.5 33.5
+    35.5 720 17 7420201 Công nghệ sinh học 33.5 26.5 32 680 18 7440301 Khoa học môi
+    trường 26 22 31 650 19 7460112 Toán ứng dụng 31.5 31.1 31 680 20 7460201 Thống
+    kê 28 29.1 31 680 21 7480101 Khoa học máy tính 38 35 35 850 22 7480102 Mạng máy
+    tính và truyền thông dữ liệu 36.25 34.5 32.5 800 23 7480103 Kỹ thuật phần mềm
+    38 35.4 35.5 850 24 7510406 Công nghệ kỹ thuật môi trường (Chuyên ngành Cấp thoát
+    nước và môi trường nước) 26 22 30 650 25 7520114 Kỹ thuật cơ điện tử 33 28.5 32
+    680 26 7520201 Kỹ thuật điện 31 27.5 32 650 27 7520207 Kỹ thuật điện tử - viễn
+    thông 31 29.5 32 650 28 7520216 Kỹ thuật điều khiển và tự động hóa 33 31.7 32
+    680 29 7520301 Kỹ thuật hóa học 34 28.5 32 680 30 7580101 Kiến trúc 28 26 32 680
+    HHMT≥6.0 31 7580105 Quy hoạch vùng và đô thị 27 23 30 650 32 7580108 Thiết kế
+    nội thất 27 24 32 650 HHMT≥6.0 33 7580201 Kỹ thuật xây dựng 29 25 32 650 34 7580205
+    Kỹ thuật xây dựng công trình giao thông 27 23 30 650 35 7720201 Dược học 36 HSG
+    lớp 12 33.2 HSG lớp 12 800 HSG lớp 12 36 7760101 Công tác xã hội 27 25.3 30 650
+    37 7810301 Quản lý thể dục thể thao (Chuyên ngành kinh doanh thể thao và tổ chức
+    sự kiện) 31.5 27 30 650 38 7810302 Golf 27 23 30 650 39 7850201 Bảo hộ lao động
+    27 23 30 650  CHƯƠNG TRÌNH CHẤT LƯỢNG CAO 1 F7210403 Thiết kế đồ họa - Chương
+    trình Chất lượng cao 26.5 23 30 650 HHMT≥6.0 2 F7220201 Ngôn ngữ Anh – Chương
+    trình Chất lượng cao 34 29.9 32 700 3 F7310630Q Việt Nam học (Chuyên ngành Du
+    lịch và Quản lý du lịch) - Chương trình Chất lượng cao 27 27 32 650 4 F7340101
+    Quản trị kinh doanh (Chuyên ngành: Quản trị nguồn nhân lực) - Chương trình Chất
+    lượng cao 35.5 32.7 33 700 5 F7340101N Quản trị kinh doanh (Chuyên ngành: Quản
+    trị nhà hàng - khách sạn) - Chương trình Chất lượng cao 33 29.1 32 700 6 F7340115
+    Marketing - Chương trình Chất lượng cao 36 33.5 35 750 7 F7340120 Kinh doanh quốc
+    tế - Chương trình Chất lượng cao 36.5 32.8 36 750 8 F7340201 Tài chính - Ngân
+    hàng - Chương trình Chất lượng cao 33 30.1 32 700 9 F7340301 Kế toán - Chương
+    trình Chất lượng cao 31 29.2 32 650 10 F7380101 Luật - Chương trình Chất lượng
+    cao 32 32.1 32 650 11 F7420201 Công nghệ sinh học - Chương trình Chất lượng cao
+    27 22 30 650 12 F7480101 Khoa học máy tính - Chương trình Chất lượng cao 36.25
+    34.5 32 800 13 F7480103 Kỹ thuật phần mềm - Chương trình Chất lượng cao 36.25
+    34.5 32 800 14 F7520201 Kỹ thuật điện - Chương trình Chất lượng cao 27 22 30 650
+    15 F7520207 Kỹ thuật điện tử - viễn thông - Chương trình Chất lượng cao 27 22
+    30 650 16 F7520216 Kỹ thuật điều khiển và tự động hóa - Chương trình Chất lượng
+    cao 27 25 30 650 17 F7580201 Kỹ thuật xây dựng - Chương trình Chất lượng cao 27
+    22 30 650  CHƯƠNG TRÌNH ĐẠI HỌC BẰNG TIẾNG ANH
+    Yêu cầu về tiếng Anh đầu vào:
+    Thí sinh nước ngoài ở các nước có ngôn ngữ chính là tiếng Anh không yêu cầu Chứng
+    chỉ tiếng Anh đầu vào quốc tế;
+    Thí sinh Việt Nam và thí sinh ở các nước không có ngôn ngữ chính là tiếng Anh:
+    phải có Chứng chỉ IELTS 5.0 trở lên hoặc tương đương (có giá trị từ ngày 01/10/2020
+    và còn giá trị đến ngày 01/10/2022); hoặc phải dự thi đánh giá năng lực tiếng
+    Anh bằng Hệ thống đánh giá năng lực tiếng Anh theo chuẩn quốc tế của TDTU để được
+    xác nhận đủ điều kiện tiếng Anh theo học chương trình (trừ Ngành ngôn ngữ Anh
+    phải có chứng chỉ tiếng Anh quốc tế tương đương IELTS 5.0 trở lên theo quy định).
+    Trường hợp số lượng học viên nhập học đủ điều kiện học chính thức ít hơn sĩ số
+    tối thiểu để mở lớp, người học được tư vấn để bảo lưu kết quả tuyển sinh, hoặc
+    chuyển qua các ngành/chương trình khác (nếu đáp ứng được tiêu chí tuyển đầu vào
+    của ngành/chương trình đó). Chương trình đại học bằng tiếng Anh:
+    STT Mã ngành Tên ngành Điểm TT PT1-Đ2 Điểm TT PT2 Điểm TT PT3-ĐT1-Đ2 Điểm TT PT3-ĐT2-Đ2
+    Điểm TT PT5 1 FA7220201 Ngôn ngữ Anh – Chương trình đại học bằng tiếng Anh 32
+    25 30 34.5 700 2 FA7310630Q Việt Nam học (Chuyên ngành Du lịch và Quản lý du lịch)
+    - Chương trình đại học bằng tiếng Anh 28 24 28 28 650 3 FA7340101N Quản trị kinh
+    doanh (Chuyên ngành: Quản trị nhà hàng - khách sạn) - Chương trình đại học bằng
+    tiếng Anh 30 27 30 30 650 4 FA7340115 Marketing - Chương trình đại học bằng tiếng
+    Anh 34 27 32 36 700 5 FA7340120 Kinh doanh quốc tế - Chương trình đại học bằng
+    tiếng Anh 34 27 32 36 700 6 FA7340201 Tài chính ngân hàng - Chương trình đại học
+    bằng tiếng Anh 28 24 28 28 650 7 FA7340301 Kế toán (Chuyên ngành: Kế toán quốc
+    tế) - Chương trình đại học bằng tiếng Anh 28 24 28 28 650 8 FA7420201 Công nghệ
+    sinh học - Chương trình đại học bằng tiếng Anh 28 24 28 28 650 9 FA7480101 Khoa
+    học máy tính - Chương trình đại học bằng tiếng Anh 30 24 30 30 650 10 FA7480103
+    Kỹ thuật phần mềm - Chương trình đại học bằng tiếng Anh 30 24 30 30 650 11 FA7520216
+    Kỹ thuật điều khiển và tự động hóa - Chương trình đại học bằng tiếng Anh 28 24
+    28 28 650 12 FA7580201 Kỹ thuật xây dựng - Chương trình đại học bằng tiếng Anh
+    28 24 28 28 650
+    Chương trình học tại Phân hiệu Khánh Hòa:
+    STT Mã ngành Tên ngành Điểm TT PT1-Đ2 Điểm TT PT2 Điểm TT PT3-ĐT1-Đ2 Điểm TT PT3-ĐT2-Đ2
+    Điểm TT PT5 1 N7220201 Ngôn ngữ Anh - Chương trình học Phân hiệu Khánh Hòa 28
+    24 31 650 2 N7310630 Việt Nam học (Chuyên ngành: Du lịch và lữ hành) - Chương
+    trình học Phân hiệu Khánh Hòa 27 22 30 650 3 N7340101N Quản trị kinh doanh, Chuyên
+    ngành: Quản trị nhà hàng - khách sạn - Chương trình học Phân hiệu Khánh Hòa 29
+    24 31 650 4 N7340115 Marketing - Chương trình học Phân hiệu Khánh Hòa 29 24 31
+    650 5 N7340301 Kế toán - Chương trình học Phân hiệu Khánh Hòa 27 22 30 650 6 N7380101
+    Luật - Chương trình học Phân hiệu Khánh Hòa 27 22 30 650 7 N7480103 Kỹ thuật phần
+    mềm - Chương trình học Phân hiệu Khánh Hòa 27 22 31 650 CHƯƠNG TRÌNH LIÊN KẾT
+    QUỐC TẾ
+    Yêu cầu về tiếng Anh đầu vào:
+    Thí sinh phải đạt trình độ tiếng Anh đầu vào từ B2 trở lên hoặc tương đương để
+    được công nhận trúng tuyển vào chương trình chính thức.Thí sinh có thể nộp chứng
+    chỉ IELTS 5.5 hoặc các chứng chỉ quốc tế tương đương để xét tiếng Anh đầu vào;
+    hoặc phải dự thi đánh giá năng lực tiếng Anh đầu khóa bằng Hệ thống đánh giá năng
+    lực tiếng Anh theo chuẩn quốc tế của TDTU để được xác nhận đủ điều kiện tiếng
+    Anh theo học chương trình. Ngoại lệ:
+    Nếu tiếng Anh chưa đạt chuẩn B2, nhưng người học vẫn muốn học chương trình liên
+    kết đào tạo quốc tế, thì được xét vào chương trình dự bị tiếng Anh (liên kết quốc
+    tế) và phải tham gia học bổ túc tiếng Anh tại TDTU cho đến khi đạt trình độ tương
+    đương chuẩn nói trên để được “quyết định nhập học và công nhận là sinh viên”.
+    Thời gian học tiếng Anh tối đa là 2 năm và tùy năng lực đầu vào qua kết quả đánh
+    giá đầu vào xếp lớp của TDTU. Sau thời gian học chương trình dự bị tiếng Anh,
+    nếu vẫn chưa đạt chuẩn tiếng Anh trình độ B2 hoặc tương đương; người học phải
+    thôi học hoặc có thể xin chuyển sang các chương trình khác (nếu vẫn bảo đảm được
+    các tiêu chí tuyển sinh đầu vào tương ứng của các ngành/chương trình này theo
+    đúng năm tuyển sinh ). Trường hợp số lượng học viên nhập học đủ điều kiện học
+    chính thức ít hơn sĩ số tối thiểu để mở lớp, người học được tư vấn để bảo lưu
+    kết quả tuyển sinh, hoặc chuyển qua các ngành/chương trình khác (nếu đáp ứng được
+    tiêu chí tuyển đầu vào của ngành/chương trình đó). STT Mã ngành Tên ngành Điểm
+    TT PT1-Đ2 Điểm TT PT2 Điểm TT PT3-ĐT1-Đ2 Điểm TT PT3-ĐT2-Đ2 Điểm TT PT5 1 K7340101
+    Quản trị kinh doanh (song bằng, 2+2) - Chương trình liên kết Đại học Kinh tế Praha
+    (Cộng hòa Séc) 28 24 28 28 650 2 K7340101N Quản trị nhà hàng khách sạn (song bằng,
+    2.5+1.5) - Chương trình liên kết Đại học Taylor''s (Malaysia) 28 24 28 28 650
+    3 K7340120 Quản trị kinh doanh quốc tế (đơn bằng, 3+1) - Chương trình liên kết
+    Đại học Khoa học và công nghệ Lunghwa (Đài Loan) 28 24 28 28 650 4 K7340201 Tài
+    chính (song bằng, 2+2) - Chương trình liên kết Đại học Feng Chia (Đài Loan) 28
+    24 28 28 650 5 K7340201S Tài chính (đơn bằng, 3+1) - Chương trình liên kết Đại
+    học Khoa học và công nghệ Lunghwa (Đài Loan) 28 24 28 28 650 6 K7340201X Tài chính
+    và kiểm soát (song bằng, 3+1) - Chương trình liên kết Đại học Khoa học ứng dụng
+    Saxion (Hà Lan) 28 24 28 28 650 7 K7340301 Kế toán (song bằng, 3+1) - Chương trình
+    liên kết Đại học West of England, Bristol (Anh) 28 24 28 28 650 8 K7480101 Khoa
+    học máy tính & Công nghệ tin học (đơn bằng, 2+2) - Chương trình liên kết Đại học
+    Khoa học và công nghệ Lunghwa (Đài Loan) 28 24 28 28 650 9 K7480101L Công nghệ
+    thông tin (song bằng, 2+2) - Chương trình liên kết Đại học La Trobe (Úc) 28 24
+    28 28 650 10 K7520201 Kỹ thuật điện – điện tử (song bằng, 2.5+1.5) - Chương trình
+    liên kết Đại học Khoa học ứng dụng Saxion (Hà Lan) 28 24 28 28 650 11 K7580201
+    Kỹ thuật xây dựng (song bằng, 2+2) - Chương trình liên kết Đại học La Trobe (Úc)
+    28 24 28 28 650 Đính kèm phụ lục điểm trúng tuyển chi tiết theo từng phương thức
+    Phụ lục điểm trúng tuyển chi tiết phương thức 1-đợt 2 (tại đây)
+    Phụ lục điểm trúng tuyển chi tiết phương thức 2 (tại đây)
+    Phụ lục điểm trúng tuyển chi tiết phương thức 3-đợt 2 (tại đây)
+    Thí sinh tra cứu kết quả trúng tuyển từ 17h ngày 17/9/2022 tại website https://tracuuxettuyen.tdtu.edu.vn
+    Lưu ý: Thí sinh đủ điểm trúng tuyển của TDTU công bố nhưng không có trong danh
+    sách trúng tuyển chính thức có thể do thí sinh đã đăng ký không chính xác nguyện
+    vọng trên hệ thống Bộ GD&ĐT hoặc đã trúng tuyển ở nguyện vọng khác có thứ tự ưu
+    tiên cao hơn.'
+  - 'Đại học Kinh tế Thành phố Hồ Chí Minh (UEH)
+    Đại học Kinh tế Thành phố Hồ Chí Minh (tiếng Anh: University of Economics Ho Chi
+    Minh City – UEH), còn được gọi là Đại học UEH, là một đại học công lập đa ngành
+    trực thuộc Bộ Giáo dục và Đào tạo. UEH nằm trong nhóm các trường đại học trọng
+    điểm quốc gia, dẫn đầu trong đào tạo khối ngành kinh tế tại Việt Nam. UEH không
+    chỉ là một trụ cột quan trọng trong hệ thống giáo dục bậc cao mà còn là trung
+    tâm nghiên cứu các chính sách kinh tế và quản lý cho chính phủ cùng các doanh
+    nghiệp lớn. UEH đã đào tạo nhiều lãnh đạo cấp cao cho các tập đoàn đa quốc gia
+    nổi tiếng trong và ngoài nước. Lịch sử hình thành và phát triển
+    1976: Thành lập với tên gọi Trường Đại học Kinh tế trực thuộc Bộ Đại học và Trung
+    học chuyên nghiệp. 1996: Sáp nhập với hai đơn vị khác, trở thành Trường Đại học
+    Kinh tế trực thuộc Đại học Quốc gia Thành phố Hồ Chí Minh. 2000: Tách ra khỏi
+    Đại học Quốc gia Thành phố Hồ Chí Minh, trở thành Trường Đại học Kinh tế Thành
+    phố Hồ Chí Minh trực thuộc Bộ Giáo dục và Đào tạo. 2021: Tái cấu trúc, thành lập
+    các trường thành viên và định hướng phát triển thành đại học đa ngành, đa lĩnh
+    vực. 2023: Chính thức chuyển đổi thành Đại học Kinh tế Thành phố Hồ Chí Minh.
+    Cơ sở vật chất và hoạt động
+    Hiện nay, UEH sở hữu: - 10 cơ sở giảng dạy tại Thành phố Hồ Chí Minh.'
+  - '4. CÁC NGÀNH ĐÀO TẠO
+    a. ĐẠI HỌC
+    Cử nhân Sư phạm Toán học (Hệ Chính quy, Hệ Vừa làm vừa học)
+    b.SAU ĐẠI HỌC
+    Thạc sĩ Toán giải tích
+    Thạc sĩ Đại số và Lý thuyết số
+    Thạc sĩ Hình học và Tôpô
+    Thạc sĩ Lý luận và Phương pháp dạy học bộ môn Toán
+    Tiến sĩ Toán Giải tích
+    Tiến sĩ Hình học và Tôpô
+    Tiến sĩ Lý luận và Phương pháp dạy học bộ môn Toán
+    c. BỒI DƯỠNG
+    Chuyên đề bồi dưỡng cho giáo viên tiểu học, trung học cơ sở và trung học phổ thông
+    về phương pháp, kĩ thuật dạy học, nội dung dạy học, kiểm tra, đánh giá, ứng dụng
+    công nghệ thông tin trong dạy học,…
+    vi. Khoa Công nghệ Thông tin
+    1. CHẤT LƯỢNG ĐÀO TẠO
+    ĐÀO TẠO CỬ NHÂN (4 NĂM)
+    Sư phạm Tin học: 90 – 100 SV/năm
+    Công nghệ Thông tin: 180 – 200 SV/năm
+    ĐÀO TẠO CAO HỌC (2 NĂM)
+    Thạc sĩ Khoa học máy tính: 15-35 HV/ năm
+    2. CHẤT LƯỢNG GIẢNG VIÊN
+    ĐỘI NGŨ GIẢNG VIÊN: 24
+    Tiến sĩ: 9
+    Thạc sĩ: 15
+    3. MỤC TIÊU ĐÀO TẠO
+    Đào tạo giáo viên dạy Tin học bậc phổ thông có trình độ cử nhân Sư phạm Tin học,
+    có phẩm chất chính trị, đạo đức và sức khỏe tốt, hiểu và vận dụng các tri thức
+    cơ bản của Tin học; Lý luận và phương pháp giảng dạy Tin học ở trường trung học,
+    tiểu học. Sau khi tốt nghiệp, người học có đủ năng lực để giảng dạy Tin học tại
+    các trường trung học, tiểu học và một số cơ sở giáo dục tương đương. Đào tạo cử
+    nhân Công nghệ thông tin, có phẩm chất chính trị, đạo đức và sức khỏe tốt, hiểu
+    và vận dụng các tri thức cơ bản về khoa học máy tính. Sau khi tốt nghiệp, người
+    học có đủ năng lực để làm việc trong môi trường các cơ sở sản xuất, các viện hoặc
+    trung tâm nghiên cứu trong lĩnh vực Công nghệ thông tin cũng như có thể tiếp tục
+    theo các bậc học cao hơn.'
+- source_sentence: Xin hãy liệt kê các trung tâm của Trường Đại học Sư phạm Kỹ thuật
+    TP. Hồ Chí Minh.
+  sentences:
+  - 'Nếu có thắc mắc thí sinh vui lòng liên hệ số điện thoại hỗ trợ tuyển sinh: 19002024'
+  - 'Thực hiện hướng dẫn của Bộ Giáo dục và Đào tạo tại Công văn số 1919/BGDĐT-GDĐH
+    ngày 28 tháng 4 năm 2023, phương thức xét tuyển kết quả điểm thi tốt nghiệp Trung
+    học phổ thông vẫn được giữ nguyên như năm 2022. Tổ hợp môn xét tuyển: B00 (Toán
+    – Hóa – Sinh) chung cho tất cả các ngành. năm 2022, Trường Đại học Y khoa Phạm
+    Ngọc Thạch tuyển được 1.367 chỉ tiêu (đạt 104,4% so với chỉ tiêu đề ra). chỉ tiêu
+    tuyển sinh đại học chính quy của Trường Đại học Y khoa Phạm Ngọc Thạch năm 2023.
+    1. Y khoa: 660 2. Dược học: 90 3. Điều dưỡng: 250 4. Dinh dưỡng: 60 5. Răng Hàm
+    Mặt: 90 6. Kỹ thuật xét nghiệm y học: 50 7. Kỹ thuật hình ảnh y học: 40 8. Kỹ
+    thuật phục hồi chức năng: 30 9. Khúc xạ nhãn khoa: 40 10. Y tế công cộng: 56
+    Ghi chú: chỉ tiêu được chia cho các thí sinh có hộ khẩu ở TP HCM và ngoài TP HCM
+    với tỉ lệ 50%
+    Điểm chuẩn của trường Đại học Y khoa Phạm Ngọc Thạch 2023: Y khoa, Điểm chuẩn
+    thí sinh có hộ khẩu tại TP HCM(TP): 25,90, Điểm chuẩn thí sinh có hộ khẩu ngoài
+    TP HCM(TQ): 26.31 Dược học, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 25,28,
+    Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 25,25 Điều dưỡng, Điểm chuẩn
+    thí sinh có hộ khẩu tại TP HCM(TP): 22,40, Điểm chuẩn thí sinh có hộ khẩu ngoài
+    TP HCM(TQ): 22,40 Dinh dưỡng, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 22,25,
+    Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 22,80 Răng - Hàm - Mặt, Điểm
+    chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 26,00, Điểm chuẩn thí sinh có hộ khẩu
+    ngoài TP HCM(TQ): 26,28 Kỹ thuật Xét nghiệm Y học, Điểm chuẩn thí sinh có hộ khẩu
+    tại TP HCM(TP): 24,54, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 24,47
+    Kỹ thuật Hình ảnh Y học, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 23,45,
+    Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 23,61 Khúc xạ nhãn khoa, Điểm
+    chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 23,75, Điểm chuẩn thí sinh có hộ khẩu
+    ngoài TP HCM(TQ): 23,75 Y tế công cộng, Điểm chuẩn thí sinh có hộ khẩu tại TP
+    HCM(TP): 18,85, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 18,35 Kỹ thuật
+    Phục hồi chức năng, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 23,15, Điểm
+    chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 23,09'
+  - 'Phòng Đào tạo
+    2. Phòng Đào tạo không chính quy
+    3. Phòng Tuyển sinh và Công tác Sinh viên
+    4. Phòng Truyền thông
+    5. Phòng Khoa học Công nghệ - Quan hệ Quốc tế
+    6. Phòng Quan hệ Doanh nghiệp
+    7. Phòng Thanh tra - Giáo dục
+    8. Phòng Đảm bảo Chất lượng
+    9. Phòng Tổ chức - Hành chính
+    10. Phòng Kế hoạch - Tài chính
+    11. Phòng Quản trị Cơ sở Vật chất
+    12. Phòng Thiết bị - Vật tư
+    13. Ban quản lý KTX
+    14. Trạm Y tế
+    15. Bộ phận Quản lý Hồ sơ Dự án
+    C. Danh sách các trung tâm của Trường Đại học Sư phạm Kỹ thuật Thành phố Hồ Chí
+    Minh:
+    1. Ngoại ngữ
+    2. Tin học
+    3. Thư viện
+    4. Hợp tác Đào tạo Quốc tế
+    5. Việt – Đức
+    6. Dịch vụ Sinh viên
+    7. Thông tin – Máy tính
+    8. Dạy học số
+    9. Kỹ thuật Tổng hợp
+    10. Chế tạo và Thiết kế Thiết bị Công nghiệp
+    11. Đào tạo và hướng nghiệp quốc tế Việt Nhật
+    12. Đào tạo ngắn hạn
+    13. Giáo dục Thể chất - Quốc phòng
+    14. Đào tạo Bồi dưỡng giáo viên phổ thông, giáo dục nghề nghiệp miền Trung - Tây
+    Nguyên
+    15. Nghiên cứu và Ứng dụng Kỹ thuật Xây dựng
+    16. Bồi dưỡng và Đánh giá kỹ năng nghề Quốc gia
+    17. Phát triển ngôn ngữ
+    18. Nghiên cứu và Chuyển giao Công nghệ
+    19. Công nghệ phần mềm
+    20. Hàn ngữ học Dong A
+    21. Sáng tạo và Khởi nghiệp
+    22. Trung tâm hướng nghiệp và đào tạo Việt Nhật
+    D. Các ngành đào tạo trình độ đại học
+    Đi cùng với sự vận động và phát triển của nền kinh tế đất nước theo hướng công
+    nghiệp hóa, hiện đại hóa, Trường Đại học Sư phạm Kỹ thuật Tp. Hồ Chí Minh đã tiếp
+    cận thực tế để mở rộng đào tạo gần 30 ngành đào tạo trình độ đại học
+    i.'
+pipeline_tag: sentence-similarity
+library_name: sentence-transformers
+metrics:
+- cosine_accuracy@1
+- cosine_accuracy@3
+- cosine_accuracy@5
+- cosine_accuracy@10
+- cosine_precision@1
+- cosine_precision@3
+- cosine_precision@5
+- cosine_precision@10
+- cosine_recall@1
+- cosine_recall@3
+- cosine_recall@5
+- cosine_recall@10
+- cosine_ndcg@10
+- cosine_mrr@10
+- cosine_map@100
+model-index:
+- name: SentenceTransformer based on dangvantuan/vietnamese-document-embedding
+  results:
+  - task:
+      type: information-retrieval
+      name: Information Retrieval
+    dataset:
+      name: dim 768
+      type: dim_768
+    metrics:
+    - type: cosine_accuracy@1
+      value: 0.6759510869565217
+      name: Cosine Accuracy@1
+    - type: cosine_accuracy@3
+      value: 0.9001358695652174
+      name: Cosine Accuracy@3
+    - type: cosine_accuracy@5
+      value: 0.9483695652173914
+      name: Cosine Accuracy@5
+    - type: cosine_accuracy@10
+      value: 0.985733695652174
+      name: Cosine Accuracy@10
+    - type: cosine_precision@1
+      value: 0.6759510869565217
+      name: Cosine Precision@1
+    - type: cosine_precision@3
+      value: 0.30004528985507245
+      name: Cosine Precision@3
+    - type: cosine_precision@5
+      value: 0.18967391304347825
+      name: Cosine Precision@5
+    - type: cosine_precision@10
+      value: 0.09857336956521738
+      name: Cosine Precision@10
+    - type: cosine_recall@1
+      value: 0.6759510869565217
+      name: Cosine Recall@1
+    - type: cosine_recall@3
+      value: 0.9001358695652174
+      name: Cosine Recall@3
+    - type: cosine_recall@5
+      value: 0.9483695652173914
+      name: Cosine Recall@5
+    - type: cosine_recall@10
+      value: 0.985733695652174
+      name: Cosine Recall@10
+    - type: cosine_ndcg@10
+      value: 0.8420877438453158
+      name: Cosine Ndcg@10
+    - type: cosine_mrr@10
+      value: 0.794684912008282
+      name: Cosine Mrr@10
+    - type: cosine_map@100
+      value: 0.7957360881986503
+      name: Cosine Map@100
+  - task:
+      type: information-retrieval
+      name: Information Retrieval
+    dataset:
+      name: dim 512
+      type: dim_512
+    metrics:
+    - type: cosine_accuracy@1
+      value: 0.6827445652173914
+      name: Cosine Accuracy@1
+    - type: cosine_accuracy@3
+      value: 0.9137228260869565
+      name: Cosine Accuracy@3
+    - type: cosine_accuracy@5
+      value: 0.9565217391304348
+      name: Cosine Accuracy@5
+    - type: cosine_accuracy@10
+      value: 0.9898097826086957
+      name: Cosine Accuracy@10
+    - type: cosine_precision@1
+      value: 0.6827445652173914
+      name: Cosine Precision@1
+    - type: cosine_precision@3
+      value: 0.30457427536231885
+      name: Cosine Precision@3
+    - type: cosine_precision@5
+      value: 0.19130434782608693
+      name: Cosine Precision@5
+    - type: cosine_precision@10
+      value: 0.09898097826086956
+      name: Cosine Precision@10
+    - type: cosine_recall@1
+      value: 0.6827445652173914
+      name: Cosine Recall@1
+    - type: cosine_recall@3
+      value: 0.9137228260869565
+      name: Cosine Recall@3
+    - type: cosine_recall@5
+      value: 0.9565217391304348
+      name: Cosine Recall@5
+    - type: cosine_recall@10
+      value: 0.9898097826086957
+      name: Cosine Recall@10
+    - type: cosine_ndcg@10
+      value: 0.848423953670157
+      name: Cosine Ndcg@10
+    - type: cosine_mrr@10
+      value: 0.8016792292098005
+      name: Cosine Mrr@10
+    - type: cosine_map@100
+      value: 0.8024031231126366
+      name: Cosine Map@100
+  - task:
+      type: information-retrieval
+      name: Information Retrieval
+    dataset:
+      name: dim 256
+      type: dim_256
+    metrics:
+    - type: cosine_accuracy@1
+      value: 0.6813858695652174
+      name: Cosine Accuracy@1
+    - type: cosine_accuracy@3
+      value: 0.9157608695652174
+      name: Cosine Accuracy@3
+    - type: cosine_accuracy@5
+      value: 0.9599184782608695
+      name: Cosine Accuracy@5
+    - type: cosine_accuracy@10
+      value: 0.9891304347826086
+      name: Cosine Accuracy@10
+    - type: cosine_precision@1
+      value: 0.6813858695652174
+      name: Cosine Precision@1
+    - type: cosine_precision@3
+      value: 0.3052536231884058
+      name: Cosine Precision@3
+    - type: cosine_precision@5
+      value: 0.19198369565217388
+      name: Cosine Precision@5
+    - type: cosine_precision@10
+      value: 0.09891304347826084
+      name: Cosine Precision@10
+    - type: cosine_recall@1
+      value: 0.6813858695652174
+      name: Cosine Recall@1
+    - type: cosine_recall@3
+      value: 0.9157608695652174
+      name: Cosine Recall@3
+    - type: cosine_recall@5
+      value: 0.9599184782608695
+      name: Cosine Recall@5
+    - type: cosine_recall@10
+      value: 0.9891304347826086
+      name: Cosine Recall@10
+    - type: cosine_ndcg@10
+      value: 0.848428744710359
+      name: Cosine Ndcg@10
+    - type: cosine_mrr@10
+      value: 0.8016997174775718
+      name: Cosine Mrr@10
+    - type: cosine_map@100
+      value: 0.8024753262882551
+      name: Cosine Map@100
+---
+# SentenceTransformer based on dangvantuan/vietnamese-document-embedding
+This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [dangvantuan/vietnamese-document-embedding](https://huggingface.co/dangvantuan/vietnamese-document-embedding). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+## Model Details
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [dangvantuan/vietnamese-document-embedding](https://huggingface.co/dangvantuan/vietnamese-document-embedding) <!-- at revision 6fa4e2f8ed2d33120b0f4442cc81f8f973c3f56b -->
+- **Maximum Sequence Length:** 8192 tokens
+- **Output Dimensionality:** 768 dimensions
+- **Similarity Function:** Cosine Similarity
+<!-- - **Training Dataset:** Unknown -->
+<!-- - **Language:** Unknown -->
+<!-- - **License:** Unknown -->
+### Model Sources
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+### Full Model Architecture
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'VietnameseModel'})
+  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+  (2): Normalize()
+)
+```
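The architecture printout lists three stages: a Transformer encoder, a Pooling layer with `pooling_mode_cls_token` enabled, and a `Normalize()` layer. The sketch below illustrates only the last two stages on toy numbers, assuming the first token of each sequence is the CLS token. The helper names `cls_pool` and `l2_normalize` are hypothetical and are not part of the library.

```python
import math


def cls_pool(token_embeddings):
    """Return the vector of the first (CLS) token of one sequence."""
    return token_embeddings[0]


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


# Toy sequence: 3 tokens with 4 dimensions each (the model itself uses 768).
tokens = [
    [3.0, 4.0, 0.0, 0.0],  # CLS token
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]

sentence_embedding = l2_normalize(cls_pool(tokens))
print(sentence_embedding)
```

Because the output is unit length, the dot product of two such embeddings equals their cosine similarity.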
+## Usage
+### Direct Usage (Sentence Transformers)
+First install the Sentence Transformers library:
+```bash
+pip install -U sentence-transformers
+```
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
+# Download from the 🤗 Hub
+model = SentenceTransformer("HoangVuSnape/vietnamese-document-embedding_pr_v2_ep30_new")
+# Run inference
+sentences = [
+    'Xin hãy liệt kê các trung tâm của Trường Đại học Sư phạm Kỹ thuật TP. Hồ Chí Minh.',
+    'Phòng Đào tạo\n\n2. Phòng Đào tạo không chính quy\n\n3. Phòng Tuyển sinh và Công tác Sinh viên\n\n4. Phòng Truyền thông\n\n5. Phòng Khoa học Công nghệ - Quan hệ Quốc tế\n\n6. Phòng Quan hệ Doanh nghiệp\n\n7. Phòng Thanh tra - Giáo dục\n\n8. Phòng Đảm bảo Chất lượng\n\n9. Phòng Tổ chức - Hành chính\n\n10. Phòng Kế hoạch - Tài chính\n\n11. Phòng Quản trị Cơ sở Vật chất\n\n12. Phòng Thiết bị - Vật tư\n\n13. Ban quản lý KTX\n\n14. Trạm Y tế\n\n15. Bộ phận Quản lý Hồ sơ Dự án\n\nC. Danh sách các trung tâm của Trường Đại học Sư phạm Kỹ thuật Thành phố Hồ Chí Minh:\n\n1. Ngoại ngữ\n\n2. Tin học\n\n3. Thư viện\n\n4. Hợp tác Đào tạo Quốc tế\n\n5. Việt – Đức\n\n6. Dịch vụ Sinh viên\n\n7. Thông tin – Máy tính\n\n8. Dạy học số\n\n9. Kỹ thuật Tổng hợp\n\n10. Chế tạo và Thiết kế Thiết bị Công nghiệp\n\n11. Đào tạo và hướng nghiệp quốc tế Việt Nhật\n\n12. Đào tạo ngắn hạn\n\n13. Giáo dục Thể chất - Quốc phòng\n\n14. Đào tạo Bồi dưỡng giáo viên phổ thông, giáo dục nghề nghiệp miền Trung - Tây Nguyên\n\n15. Nghiên cứu và Ứng dụng Kỹ thuật Xây dựng\n\n16. Bồi dưỡng và Đánh giá kỹ năng nghề Quốc gia\n\n17. Phát triển ngôn ngữ\n\n18. Nghiên cứu và Chuyển giao Công nghệ\n\n19. Công nghệ phần mềm\n\n20. Hàn ngữ học Dong A\n\n21. Sáng tạo và Khởi nghiệp\n\n22. Trung tâm hướng nghiệp và đào tạo Việt Nhật\n\nD. Các ngành đào tạo trình độ đại học\n\nĐi cùng với sự vận động và phát triển của nền kinh tế đất nước theo hướng công nghiệp hóa, hiện đại hóa, Trường Đại học Sư phạm Kỹ thuật Tp. Hồ Chí Minh đã tiếp cận thực tế để mở rộng đào tạo gần 30 ngành đào tạo trình độ đại học\n\ni.',
+    'Thực hiện hướng dẫn của Bộ Giáo dục và Đào tạo tại Công văn số 1919/BGDĐT-GDĐH ngày 28 tháng 4 năm 2023, phương thức xét tuyển kết quả điểm thi tốt nghiệp Trung học phổ thông vẫn được giữ nguyên như năm 2022. Tổ hợp môn xét tuyển: B00 (Toán – Hóa – Sinh) chung cho tất cả các ngành. năm 2022, Trường Đại học Y khoa Phạm Ngọc Thạch tuyển được 1.367 chỉ tiêu (đạt 104,4% so với chỉ tiêu đề ra). chỉ tiêu tuyển sinh đại học chính quy của Trường Đại học Y khoa Phạm Ngọc Thạch năm 2023. 1. Y khoa: 660 2. Dược học: 90 3. Điều dưỡng: 250 4. Dinh dưỡng: 60 5. Răng Hàm Mặt: 90 6. Kỹ thuật xét nghiệm y học: 50 7. Kỹ thuật hình ảnh y học: 40 8. Kỹ thuật phục hồi chức năng: 30 9. Khúc xạ nhãn khoa: 40 10. Y tế công cộng: 56\n\nGhi chú: chỉ tiêu được chia cho các thí sinh có hộ khẩu ở TP HCM và ngoài TP HCM với tỉ lệ 50%\n\nĐiểm chuẩn của trường Đại học Y khoa Phạm Ngọc Thạch 2023: Y khoa, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 25,90, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 26.31 Dược học, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 25,28, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 25,25 Điều dưỡng, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 22,40, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 22,40 Dinh dưỡng, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 22,25, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 22,80 Răng - Hàm - Mặt, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 26,00, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 26,28 Kỹ thuật Xét nghiệm Y học, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 24,54, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 24,47 Kỹ thuật Hình ảnh Y học, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 23,45, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 23,61 Khúc xạ nhãn khoa, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 23,75, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 23,75 Y tế công cộng, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 18,85, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 18,35 Kỹ thuật Phục hồi chức năng, Điểm chuẩn thí sinh có hộ khẩu tại TP HCM(TP): 23,15, Điểm chuẩn thí sinh có hộ khẩu ngoài TP HCM(TQ): 23,09',
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# (3, 768)
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities)
+# tensor([[1.0000, 0.8413, 0.0106],
+#         [0.8413, 1.0000, 0.0259],
+#         [0.0106, 0.0259, 1.0000]])
+```
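The model card reports evaluation results at 768, 512 and 256 dimensions and lists `MatryoshkaLoss` as the training loss, which suggests that embeddings can be shortened to their leading dimensions. The library-free sketch below shows the arithmetic on toy vectors: keep the first `dim` components, then re-normalize so that a dot product is again a cosine similarity. The helper names and the 8-dimensional toy vectors are assumptions for illustration only.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def cosine(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


# Toy 8-dimensional embeddings standing in for 768-dimensional model output.
query = [0.9, 0.1, 0.3, 0.2, 0.05, 0.02, 0.01, 0.03]
passage = [0.8, 0.2, 0.4, 0.1, 0.01, 0.04, 0.02, 0.01]

full_score = cosine(truncate_and_normalize(query, 8), truncate_and_normalize(passage, 8))
short_score = cosine(truncate_and_normalize(query, 4), truncate_and_normalize(passage, 4))
print(full_score, short_score)
```

Shorter vectors reduce index size and search cost. The evaluation tables in this card show how retrieval quality compares at each reported dimension.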
+<!--
+### Direct Usage (Transformers)
+<details><summary>Click to see the direct usage in Transformers</summary>
+</details>
+-->
+<!--
+### Downstream Usage (Sentence Transformers)
+You can finetune this model on your own dataset.
+<details><summary>Click to expand</summary>
+</details>
+-->
+<!--
+### Out-of-Scope Use
+*List how the model may foreseeably be misused and address what users ought not to do with the model.*
+-->
+## Evaluation
+### Metrics
+#### Information Retrieval
+* Dataset: `dim_768`
+* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
+  ```json
+  {
+      "truncate_dim": 768
+  }
+  ```
+| Metric              | Value      |
+|:--------------------|:-----------|
+| cosine_accuracy@1   | 0.676      |
+| cosine_accuracy@3   | 0.9001     |
+| cosine_accuracy@5   | 0.9484     |
+| cosine_accuracy@10  | 0.9857     |
+| cosine_precision@1  | 0.676      |
+| cosine_precision@3  | 0.3        |
+| cosine_precision@5  | 0.1897     |
+| cosine_precision@10 | 0.0986     |
+| cosine_recall@1     | 0.676      |
+| cosine_recall@3     | 0.9001     |
+| cosine_recall@5     | 0.9484     |
+| cosine_recall@10    | 0.9857     |
+| **cosine_ndcg@10**  | **0.8421** |
+| cosine_mrr@10       | 0.7947     |
+| cosine_map@100      | 0.7957     |
+#### Information Retrieval
+* Dataset: `dim_512`
+* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
+  ```json
+  {
+      "truncate_dim": 512
+  }
+  ```
+| Metric              | Value      |
+|:--------------------|:-----------|
+| cosine_accuracy@1   | 0.6827     |
+| cosine_accuracy@3   | 0.9137     |
+| cosine_accuracy@5   | 0.9565     |
+| cosine_accuracy@10  | 0.9898     |
+| cosine_precision@1  | 0.6827     |
+| cosine_precision@3  | 0.3046     |
+| cosine_precision@5  | 0.1913     |
+| cosine_precision@10 | 0.099      |
+| cosine_recall@1     | 0.6827     |
+| cosine_recall@3     | 0.9137     |
+| cosine_recall@5     | 0.9565     |
+| cosine_recall@10    | 0.9898     |
+| **cosine_ndcg@10**  | **0.8484** |
+| cosine_mrr@10       | 0.8017     |
+| cosine_map@100      | 0.8024     |
+#### Information Retrieval
+* Dataset: `dim_256`
+* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
+  ```json
+  {
+      "truncate_dim": 256
+  }
+  ```
+| Metric              | Value      |
+|:--------------------|:-----------|
+| cosine_accuracy@1   | 0.6814     |
+| cosine_accuracy@3   | 0.9158     |
+| cosine_accuracy@5   | 0.9599     |
+| cosine_accuracy@10  | 0.9891     |
+| cosine_precision@1  | 0.6814     |
+| cosine_precision@3  | 0.3053     |
+| cosine_precision@5  | 0.192      |
+| cosine_precision@10 | 0.0989     |
+| cosine_recall@1     | 0.6814     |
+| cosine_recall@3     | 0.9158     |
+| cosine_recall@5     | 0.9599     |
+| cosine_recall@10    | 0.9891     |
+| **cosine_ndcg@10**  | **0.8484** |
+| cosine_mrr@10       | 0.8017     |
+| cosine_map@100      | 0.8025     |
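In all three tables, `cosine_recall@k` equals `cosine_accuracy@k` and `cosine_precision@k` equals `cosine_accuracy@k / k`. That pattern is what one would expect if every query has exactly one relevant passage. The sketch below computes the metrics under that assumption from the 1-based rank of the relevant passage. It is an illustration of the definitions, not the evaluator's implementation, and `metrics_at_k` is a hypothetical helper.

```python
import math


def metrics_at_k(ranks, k):
    """Compute retrieval metrics when each query has one relevant passage.

    ranks: 1-based rank of the relevant passage per query, or None if it
    was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    # With a single relevant passage, the ideal DCG is 1, so NDCG is the
    # mean discounted gain of the hit.
    ndcg = sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n
    mrr = sum(1.0 / r for r, h in zip(ranks, hits) if h) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one of the k results is relevant
        "recall": accuracy,         # the single relevant passage is found or not
        "mrr": mrr,
        "ndcg": ndcg,
    }


# Five toy queries: the relevant passage is ranked 1st, 2nd, 1st, 4th, and missing.
toy_ranks = [1, 2, 1, 4, None]
toy_metrics = metrics_at_k(toy_ranks, k=3)
print(toy_metrics)
```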
+<!--
+## Bias, Risks and Limitations
+*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+-->
+<!--
+### Recommendations
+*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+-->
+## Training Details
+### Training Dataset
+#### Unnamed Dataset
+* Size: 1,472 training samples
+* Columns: <code>anchor</code> and <code>positive</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | anchor                                                                            | positive                                                                              |
+  |:--------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
+  | type    | string                                                                            | string                                                                                |
+  | details | <ul><li>min: 9 tokens</li><li>mean: 25.49 tokens</li><li>max: 62 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 559.43 tokens</li><li>max: 6602 tokens</li></ul> |
+* Samples:
+  | anchor                                                                                                                                                     | positive                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
+  |:-----------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+  | <code>Ngành Quản lý Tài nguyên và Môi trường trang bị cho sinh viên những kiến thức và kỹ năng gì?</code>                                                  | <code>Sau khi tốt nghiệp, người học sẽ:<br><br>Có kiến thức cơ bản về toán học, khoa học tự nhiên, đáp ứng cho việc tiếp thu các kiến thức giáo dục chuyên nghiệp và khả năng học tập ở trình độ cao hơn<br><br>Có các kiến thức kỹ thuật cơ sở ngành và chuyên ngành giúp đủ năng lực phát hiện, giải quyết các vấn đề liên quan đến công nghệ sản xuất, chế tạo và ứng dụng vật liệu vào trong xây dựng, kiểm soát chất lượng nguyên vật liệu và cấu kiện sản phẩm xây dựng, nghiên cứu sản xuất chế tạo và phát triển các loại vật liệu mới, hiện đại, tiên tiến, độc đáo, hiệu quả, xanh, bền vững… nhằm hướng tới sự phát triển bền vững trong công nghiệp xây dựng và kiến trúc, thiết kế và thi công trong các công trình xây dựng; có tính sáng tạo trong hoạt động nghề nghiệp, có khả năng tự học và tự nghiên cứu;<br><br>Có kỹ năng cá nhân, nghề nghiệp, giao tiếp, làm việc nhóm đủ để làm việc trong môi trường làm việc liên ngành, đa văn hóa;<br><br>Có hiểu biết về kinh tế, chính trị, có các kiến thức cơ bản trong lĩnh vực khoa học xã hội và n...</code> |
+  | <code>Chương trình Kỹ thuật Môi trường đào tạo sinh viên về những năng lực nào và có điểm gì nổi bật đối với chương trình giảng dạy bằng tiếng Anh?</code> | <code>Sau khi tốt nghiệp, người học sẽ:<br><br>Có kiến thức cơ bản về toán học, khoa học tự nhiên, đáp ứng cho việc tiếp thu các kiến thức giáo dục chuyên nghiệp và khả năng học tập ở trình độ cao hơn<br><br>Có các kiến thức kỹ thuật cơ sở ngành và chuyên ngành giúp đủ năng lực phát hiện, giải quyết các vấn đề liên quan đến công nghệ sản xuất, chế tạo và ứng dụng vật liệu vào trong xây dựng, kiểm soát chất lượng nguyên vật liệu và cấu kiện sản phẩm xây dựng, nghiên cứu sản xuất chế tạo và phát triển các loại vật liệu mới, hiện đại, tiên tiến, độc đáo, hiệu quả, xanh, bền vững… nhằm hướng tới sự phát triển bền vững trong công nghiệp xây dựng và kiến trúc, thiết kế và thi công trong các công trình xây dựng; có tính sáng tạo trong hoạt động nghề nghiệp, có khả năng tự học và tự nghiên cứu;<br><br>Có kỹ năng cá nhân, nghề nghiệp, giao tiếp, làm việc nhóm đủ để làm việc trong môi trường làm việc liên ngành, đa văn hóa;<br><br>Có hiểu biết về kinh tế, chính trị, có các kiến thức cơ bản trong lĩnh vực khoa học xã hội và n...</code> |
+  | <code>Ngành Kỹ thuật Dầu khí và Kỹ thuật Địa chất tập trung nghiên cứu và ứng dụng những lĩnh vực cốt lõi nào?</code>                                      | <code>Các công ty nghiên cứu và khảo sát địa chất, tư vấn về nền móng công trình. Các tổ chức liên quan đến quy hoạch và phát triển đô thị. Kỹ thuật Dầu khí<br><br>Tổng quan<br><br>Kỹ thuật Dầu khí là ngành học chuyên nghiên cứu về các kỹ thuật khai thác, sản xuất và xử lý dầu khí. Sinh viên sẽ học các phương pháp khoan, khai thác dầu, khí tự nhiên, và xử lý các vấn đề kỹ thuật trong ngành dầu khí, từ việc tìm kiếm và khai thác tài nguyên cho đến việc tối ưu hóa quy trình sản xuất. CÁC ĐIỂM ĐẶC BIỆT<br><br>Khả năng ứng dụng cao: Sinh viên ngành Kỹ thuật Dầu khí sẽ được trang bị kiến thức thực tế về công nghệ khai thác dầu khí và các phương pháp tối ưu hóa sản xuất. Ngành công nghiệp chiến lược: Dầu khí vẫn là một trong những ngành công nghiệp mũi nhọn và cần nguồn nhân lực có trình độ cao trong việc khai thác và xử lý tài nguyên thiên nhiên. Triển vọng việc làm<br><br>Các công ty khai thác dầu khí trong nước và quốc tế. Các công ty tư vấn và kỹ thuật dầu khí, nghiên cứu các giải pháp tối ưu trong khai thác. Các côn...</code> |
+* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+  ```json
+  {
+      "loss": "MultipleNegativesRankingLoss",
+      "matryoshka_dims": [
+          768,
+          512,
+          256,
+          12
+      ],
+      "matryoshka_weights": [
+          1,
+          1,
+          1,
+          1
+      ],
+      "n_dims_per_step": -1
+  }
+  ```
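The configuration above wraps `MultipleNegativesRankingLoss` in `MatryoshkaLoss`: the inner loss is evaluated on embeddings truncated to each listed dimension, and the results are combined using the listed weights. The sketch below reproduces that arithmetic in plain Python on toy vectors. It assumes cosine similarity and a scale factor of 20 for the inner loss (neither value is stated in this card), and the function names are hypothetical.

```python
import math


def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mnr_loss(anchors, positives, scale=20.0):
    """In-batch negatives: positives[i] is the target for anchors[i]; every
    other positive in the batch acts as a negative (softmax cross-entropy)."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(ai, pj)) for pj in p]
        top = max(logits)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(a)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the inner loss over truncated embeddings."""
    return sum(
        w * mnr_loss([v[:d] for v in anchors], [v[:d] for v in positives])
        for d, w in zip(dims, weights)
    )


# Toy batch of two pairs with 4-dimensional embeddings, truncated to 4 and 2 dims.
anchors = [[1.0, 0.1, 0.0, 0.2], [0.1, 1.0, 0.3, 0.0]]
positives = [[0.9, 0.2, 0.1, 0.1], [0.2, 0.8, 0.2, 0.1]]

aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(aligned, swapped)
```

Correctly paired batches should produce a lower loss than mismatched ones at every truncation length, which is what pushes useful information into the leading dimensions.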
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+- `eval_strategy`: steps
+- `gradient_accumulation_steps`: 8
+- `learning_rate`: 2e-05
+- `num_train_epochs`: 30
+- `lr_scheduler_type`: cosine
+- `warmup_ratio`: 0.1
+- `bf16`: True
+- `tf32`: True
+- `dataloader_drop_last`: True
+- `dataloader_num_workers`: 8
+- `load_best_model_at_end`: True
+- `batch_sampler`: no_duplicates
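These settings determine the optimizer schedule. The worked example below derives the step counts from values listed in this card (1,472 samples, `per_device_train_batch_size` 8, `gradient_accumulation_steps` 8, 30 epochs, `warmup_ratio` 0.1), assuming a single training device. The result of 23 optimizer steps per epoch agrees with the training log, where step 10 corresponds to epoch 0.4348. The `lr_at` function is a hypothetical sketch of linear warmup followed by cosine decay, not the trainer's own code.

```python
import math

train_samples = 1472
per_device_batch = 8
grad_accum = 8
epochs = 30
warmup_ratio = 0.1
base_lr = 2e-5

# dataloader_drop_last=True discards any incomplete final batch.
batches_per_epoch = train_samples // per_device_batch  # 184
steps_per_epoch = batches_per_epoch // grad_accum      # 23 optimizer steps
effective_batch = per_device_batch * grad_accum        # 64 samples per update
total_steps = steps_per_epoch * epochs                 # 690
warmup_steps = math.ceil(total_steps * warmup_ratio)   # 69


def lr_at(step):
    """Linear warmup to base_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(steps_per_epoch, total_steps, warmup_steps)
print(lr_at(10), lr_at(warmup_steps), lr_at(total_steps))
```

Note that the `no_duplicates` batch sampler may skip samples that would repeat within a batch, so the real batch count can differ slightly from this estimate.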
+#### All Hyperparameters
+<details><summary>Click to expand</summary>
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: steps
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 8
+- `per_device_eval_batch_size`: 8
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 8
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 2e-05
+- `weight_decay`: 0.0
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 30
+- `max_steps`: -1
+- `lr_scheduler_type`: cosine
+- `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.1
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: True
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: True
+- `fp16`: False
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: True
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: True
+- `dataloader_num_workers`: 8
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: True
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch_fused
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: False
+- `resume_from_checkpoint`: None
+- `hub_model_id`: None
+- `hub_strategy`: every_save
+- `hub_private_repo`: None
+- `hub_always_push`: False
+- `hub_revision`: None
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `include_for_metrics`: []
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `use_liger_kernel`: False
+- `liger_kernel_config`: None
+- `eval_use_gather_object`: False
+- `average_tokens_across_devices`: False
+- `prompts`: None
+- `batch_sampler`: no_duplicates
+- `multi_dataset_batch_sampler`: proportional
+- `router_mapping`: {}
+- `learning_rate_mapping`: {}
+</details>
+### Training Logs
+| Epoch   | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 |
+|:-------:|:----:|:-------------:|:----------------------:|:----------------------:|:----------------------:|
+| -1      | -1   | -             | 0.4980                 | 0.4994                 | 0.4823                 |
+| 0.4348  | 10   | 3.6503        | 0.5158                 | 0.5133                 | 0.4978                 |
+| 0.8696  | 20   | 2.7131        | 0.5417                 | 0.5388                 | 0.5257                 |
+| 1.3043  | 30   | 2.2307        | 0.5621                 | 0.5637                 | 0.5534                 |
+| 1.7391  | 40   | 2.1341        | 0.5831                 | 0.5840                 | 0.5757                 |
+| 2.1739  | 50   | 1.8576        | 0.6081                 | 0.6077                 | 0.5999                 |
+| 2.6087  | 60   | 1.4278        | 0.6271                 | 0.6278                 | 0.6192                 |
+| 3.0435  | 70   | 1.3602        | 0.6396                 | 0.6412                 | 0.6335                 |
+| 3.4783  | 80   | 1.1086        | 0.6528                 | 0.6550                 | 0.6490                 |
+| 3.9130  | 90   | 0.9657        | 0.6675                 | 0.6704                 | 0.6658                 |
+| 4.3478  | 100  | 0.7836        | 0.6811                 | 0.6777                 | 0.6750                 |
+| 4.7826  | 110  | 0.6755        | 0.6892                 | 0.6883                 | 0.6881                 |
+| 5.2174  | 120  | 0.6679        | 0.6935                 | 0.6955                 | 0.6968                 |
+| 5.6522  | 130  | 0.7005        | 0.7054                 | 0.7078                 | 0.7084                 |
+| 6.0870  | 140  | 0.5895        | 0.7161                 | 0.7161                 | 0.7154                 |
+| 6.5217  | 150  | 0.4809        | 0.7233                 | 0.7232                 | 0.7205                 |
+| 6.9565  | 160  | 0.5287        | 0.7240                 | 0.7260                 | 0.7249                 |
+| 7.3913  | 170  | 0.4976        | 0.7375                 | 0.7404                 | 0.7372                 |
+| 7.8261  | 180  | 0.3886        | 0.7390                 | 0.7418                 | 0.7404                 |
+| 8.2609  | 190  | 0.5025        | 0.7481                 | 0.7531                 | 0.7516                 |
+| 8.6957  | 200  | 0.4322        | 0.7531                 | 0.7568                 | 0.7604                 |
+| 9.1304  | 210  | 0.3929        | 0.7563                 | 0.7616                 | 0.7607                 |
+| 9.5652  | 220  | 0.3131        | 0.7561                 | 0.7647                 | 0.7648                 |
+| 10.0    | 230  | 0.4091        | 0.7568                 | 0.7592                 | 0.7616                 |
+| 10.4348 | 240  | 0.3219        | 0.7557                 | 0.7604                 | 0.7643                 |
+| 10.8696 | 250  | 0.3227        | 0.7677                 | 0.7728                 | 0.7774                 |
+| 11.3043 | 260  | 0.3406        | 0.7742                 | 0.7800                 | 0.7850                 |
+| 11.7391 | 270  | 0.2998        | 0.7759                 | 0.7816                 | 0.7845                 |
+| 12.1739 | 280  | 0.2681        | 0.7766                 | 0.7824                 | 0.7867                 |
+| 12.6087 | 290  | 0.2621        | 0.7774                 | 0.7840                 | 0.7839                 |
+| 13.0435 | 300  | 0.3037        | 0.7782                 | 0.7817                 | 0.7863                 |
+| 13.4783 | 310  | 0.3236        | 0.7911                 | 0.7949                 | 0.7958                 |
+| 13.9130 | 320  | 0.2847        | 0.7962                 | 0.8013                 | 0.8026                 |
+| 14.3478 | 330  | 0.3139        | 0.7983                 | 0.8007                 | 0.8068                 |
+| 14.7826 | 340  | 0.2783        | 0.7994                 | 0.8025                 | 0.8081                 |
+| 15.2609 | 350  | 0.2623        | 0.8041                 | 0.8087                 | 0.8102                 |
+| 15.6957 | 360  | 0.2617        | 0.8102                 | 0.8105                 | 0.8149                 |
+| 16.1304 | 370  | 0.2566        | 0.8132                 | 0.8177                 | 0.8205                 |
+| 16.5652 | 380  | 0.2296        | 0.8166                 | 0.8206                 | 0.8236                 |
+| 17.0    | 390  | 0.2334        | 0.8179                 | 0.8231                 | 0.8236                 |
+| 17.4348 | 400  | 0.2386        | 0.8205                 | 0.8249                 | 0.8274                 |
+| 17.8696 | 410  | 0.1751        | 0.8241                 | 0.8261                 | 0.8300                 |
+| 18.3043 | 420  | 0.2488        | 0.8229                 | 0.8263                 | 0.8323                 |
+| 18.7391 | 430  | 0.2390        | 0.8272                 | 0.8294                 | 0.8344                 |
+| 19.1739 | 440  | 0.2231        | 0.8329                 | 0.8335                 | 0.8360                 |
+| 19.6087 | 450  | 0.2516        | 0.8341                 | 0.8352                 | 0.8411                 |
+| 20.0435 | 460  | 0.2544        | 0.8325                 | 0.8385                 | 0.8425                 |
+| 20.4783 | 470  | 0.2082        | 0.8348                 | 0.8407                 | 0.8457                 |
+| 20.9130 | 480  | 0.1868        | 0.8361                 | 0.8414                 | 0.8460                 |
+| 21.3478 | 490  | 0.2454        | 0.8361                 | 0.8437                 | 0.8454                 |
+| 21.7826 | 500  | 0.2220        | 0.8343                 | 0.8435                 | 0.8462                 |
+| 22.2174 | 510  | 0.1554        | 0.8348                 | 0.8430                 | 0.8461                 |
+| 22.6522 | 520  | 0.1400        | 0.8352                 | 0.8416                 | 0.8454                 |
+| 23.0870 | 530  | 0.1867        | 0.8357                 | 0.8422                 | 0.8463                 |
+| 23.5217 | 540  | 0.2078        | 0.8361                 | 0.8441                 | 0.8449                 |
+| 23.9565 | 550  | 0.1929        | 0.8370                 | 0.8437                 | 0.8437                 |
+| 24.3913 | 560  | 0.1776        | 0.8380                 | 0.8435                 | 0.8428                 |
+| 24.8261 | 570  | 0.2524        | 0.8387                 | 0.8448                 | 0.8449                 |
+| 25.2609 | 580  | 0.1914        | 0.8406                 | 0.8465                 | 0.8458                 |
+| 25.6957 | 590  | 0.1841        | 0.8414                 | 0.8468                 | 0.8471                 |
+| 26.1304 | 600  | 0.1650        | 0.8423                 | 0.8476                 | 0.8468                 |
+| 26.5652 | 610  | 0.1717        | 0.8417                 | 0.8489                 | 0.8492                 |
+| 27.0    | 620  | 0.2091        | 0.8414                 | 0.8488                 | 0.8484                 |
+| 27.4348 | 630  | 0.1889        | 0.8414                 | 0.8487                 | 0.8486                 |
+| 27.8696 | 640  | 0.2025        | 0.8418                 | 0.8486                 | 0.8483                 |
+| 28.3043 | 650  | 0.1722        | 0.8415                 | 0.8490                 | 0.8488                 |
+| 28.7391 | 660  | 0.1621        | 0.8418                 | 0.8483                 | 0.8490                 |
+| 29.1739 | 670  | 0.1651        | 0.8422                 | 0.8481                 | 0.8492                 |
+| 29.6087 | 680  | 0.1837        | 0.8421                 | 0.8484                 | 0.8484                 |
+### Framework Versions
+- Python: 3.10.12
+- Sentence Transformers: 5.1.0
+- Transformers: 4.55.2
+- PyTorch: 2.8.0+cu128
+- Accelerate: 1.10.0
+- Datasets: 4.0.0
+- Tokenizers: 0.21.4
+## Citation
+### BibTeX
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://arxiv.org/abs/1908.10084",
+}
+```
+#### MatryoshkaLoss
+```bibtex
+@misc{kusupati2024matryoshka,
+    title={Matryoshka Representation Learning},
+    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+    year={2024},
+    eprint={2205.13147},
+    archivePrefix={arXiv},
+    primaryClass={cs.LG}
+}
+```
+#### MultipleNegativesRankingLoss
+```bibtex
+@misc{henderson2017efficient,
+    title={Efficient Natural Language Response Suggestion for Smart Reply},
+    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+    year={2017},
+    eprint={1705.00652},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
+<!--
+## Glossary
+*Clearly define terms in order to be accessible across audiences.*
+-->
+<!--
+## Model Card Authors
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+<!--
+## Model Card Contact
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->
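
The training log in the README evaluates the same model at 768, 512, and 256 dimensions, which is what `MatryoshkaLoss` is meant to enable: a prefix of the full embedding is used as a smaller embedding. Below is a minimal pure-Python sketch of that truncate-then-renormalize step. The helper names `truncate_embedding` and `cosine_similarity` are hypothetical (they are not part of Sentence Transformers), and the vectors are toy values, not model outputs.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional "embeddings" (illustrative values only).
query = [3.0, 4.0, 1.0, 0.0]
doc = [6.0, 8.0, 0.0, 5.0]

# Score with the full vectors, then with only the first two dimensions.
full_score = cosine_similarity(query, doc)
short_score = cosine_similarity(truncate_embedding(query, 2), truncate_embedding(doc, 2))
```

With the library itself, truncation is usually requested when loading the model, for example through the `truncate_dim` argument of `SentenceTransformer`, rather than done by hand as in this sketch.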

config.json ADDED Viewed

	@@ -0,0 +1,49 @@

+{
+  "architectures": [
+    "VietnameseModel"
+  ],
+  "attention_probs_dropout_prob": 0.0,
+  "auto_map": {
+    "AutoConfig": "configuration.VietnameseConfig",
+    "AutoModel": "modeling.VietnameseModel",
+    "AutoModelForMaskedLM": "dangvantuan/Vietnamese_impl--modeling.VietnameseForMaskedLM",
+    "AutoModelForMultipleChoice": "dangvantuan/Vietnamese_impl--modeling.VietnameseForMultipleChoice",
+    "AutoModelForQuestionAnswering": "dangvantuan/Vietnamese_impl--modeling.VietnameseForQuestionAnswering",
+    "AutoModelForSequenceClassification": "dangvantuan/Vietnamese_impl--modeling.VietnameseForSequenceClassification",
+    "AutoModelForTokenClassification": "dangvantuan/Vietnamese_impl--modeling.VietnameseForTokenClassification"
+  },
+  "classifier_dropout": 0.0,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "id2label": {
+    "0": "LABEL_0"
+  },
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "label2id": {
+    "LABEL_0": 0
+  },
+  "layer_norm_eps": 1e-12,
+  "layer_norm_type": "layer_norm",
+  "logn_attention_clip1": false,
+  "logn_attention_scale": false,
+  "max_position_embeddings": 8192,
+  "model_type": "Vietnamese",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
+  "pack_qkv": true,
+  "pad_token_id": 1,
+  "position_embedding_type": "rope",
+  "rope_scaling": {
+    "factor": 8.0,
+    "type": "ntk"
+  },
+  "rope_theta": 20000,
+  "torch_dtype": "float32",
+  "transformers_version": "4.55.2",
+  "type_vocab_size": 1,
+  "unpad_inputs": false,
+  "use_memory_efficient_attention": false,
+  "vocab_size": 250048
+}
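
`config.json` sets `rope_theta` to 20000 and `rope_scaling` to `{"factor": 8.0, "type": "ntk"}`, with 12 attention heads over a hidden size of 768 (head dimension 64). The sketch below reproduces, in plain Python, the inverse-frequency arithmetic of the fixed-NTK branch (`mixed_b` is `None`) of `NTKScalingRotaryEmbedding._set_cos_sin_cache` in `modeling.py`. The function names are hypothetical, and the code only illustrates the arithmetic; it is not a substitute for the module.

```python
def rope_inv_freq(dim, base):
    """Unscaled RoPE inverse frequencies: 1 / base^(i / dim) for even i."""
    return [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]


def ntk_inv_freq(dim, base, factor):
    """Fixed-NTK variant: multiply the base by `factor`, then divide every
    frequency by factor^(2 / dim), as the mixed_b-is-None branch does."""
    scaled = rope_inv_freq(dim, base * factor)
    return [f / factor ** (2 / dim) for f in scaled]


head_dim = 768 // 12  # hidden_size / num_attention_heads
plain = rope_inv_freq(head_dim, 20000)
ntk = ntk_inv_freq(head_dim, 20000, 8.0)
```

In the module, this rescaling is applied only when the requested cache length exceeds `max_position_embeddings`; shorter caches keep the unscaled frequencies.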

config_sentence_transformers.json ADDED Viewed

	@@ -0,0 +1,14 @@

+{
+  "__version__": {
+    "sentence_transformers": "5.1.0",
+    "transformers": "4.55.2",
+    "pytorch": "2.8.0+cu128"
+  },
+  "prompts": {
+    "query": "",
+    "document": ""
+  },
+  "default_prompt_name": null,
+  "model_type": "SentenceTransformer",
+  "similarity_fn_name": "cosine"
+}

configuration.py ADDED Viewed

	@@ -0,0 +1,114 @@

+# limitations under the License.
+""" Vietnamese model configuration"""
+from transformers.configuration_utils import PretrainedConfig
+from transformers.utils import logging
+logger = logging.get_logger(__name__)
+class VietnameseConfig(PretrainedConfig):
+    r"""
+    This is the configuration class to store the configuration of a [`VietnameseModel`] or a [`TFVietnameseModel`]. It is used to
+    instantiate a Vietnamese model according to the specified arguments, defining the model architecture. Instantiating a
+    configuration with the defaults will yield a similar configuration to that of the Vietnamese
+    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PretrainedConfig`] for more information.
+    Args:
+        vocab_size (`int`, *optional*, defaults to 30528):
+            Vocabulary size of the Vietnamese model. Defines the number of different tokens that can be represented by the
+            `input_ids` passed when calling [`VietnameseModel`] or [`TFVietnameseModel`].
+        hidden_size (`int`, *optional*, defaults to 768):
+            Dimensionality of the encoder layers and the pooler layer.
+        num_hidden_layers (`int`, *optional*, defaults to 12):
+            Number of hidden layers in the Transformer encoder.
+        num_attention_heads (`int`, *optional*, defaults to 12):
+            Number of attention heads for each attention layer in the Transformer encoder.
+        intermediate_size (`int`, *optional*, defaults to 3072):
+            Dimensionality of the "intermediate" (often named feed-forward) layer in the Transformer encoder.
+        hidden_act (`str` or `Callable`, *optional*, defaults to `"gelu"`):
+            The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
+            `"relu"`, `"silu"` and `"gelu_new"` are supported.
+        hidden_dropout_prob (`float`, *optional*, defaults to 0.1):
+            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
+        attention_probs_dropout_prob (`float`, *optional*, defaults to 0.0):
+            The dropout ratio for the attention probabilities.
+        max_position_embeddings (`int`, *optional*, defaults to 2048):
+            The maximum sequence length that this model might ever be used with. Typically set this to something large
+            just in case (e.g., 512 or 1024 or 2048).
+        type_vocab_size (`int`, *optional*, defaults to 1):
+            The vocabulary size of the `token_type_ids` passed when calling [`VietnameseModel`] or [`TFVietnameseModel`].
+        initializer_range (`float`, *optional*, defaults to 0.02):
+            The standard deviation of the truncated_normal_initializer for initializing all weight matrices.
+        layer_norm_eps (`float`, *optional*, defaults to 1e-12):
+            The epsilon used by the layer normalization layers.
+        position_embedding_type (`str`, *optional*, defaults to `"rope"`):
+            Type of position embedding. Choose one of `"absolute"`, `"rope"`.
+        rope_theta (`float`, *optional*, defaults to 10000.0):
+            The base period of the RoPE embeddings.
+        rope_scaling (`Dict`, *optional*):
+            Dictionary containing the scaling configuration for the RoPE embeddings. The embedding layer in
+            `modeling.py` implements a single strategy, `"ntk"`, and raises a `ValueError` for any other type. The
+            scaling factor must be a float greater than 1. The expected format is
+            `{"type": "ntk", "factor": scaling factor}`, with an optional `"mixed_b"` key that selects mixed NTK
+            scaling. When using this flag, don't update `max_position_embeddings` to the expected new maximum. For
+            background on dynamically scaled RoPE, see
+            https://www.reddit.com/r/LocalLLaMA/comments/14mrgpr/dynamically_scaled_rope_further_increases/. This is an
+            experimental feature, subject to breaking API changes in future versions.
+        classifier_dropout (`float`, *optional*):
+            The dropout ratio for the classification head.
+    Examples:
+    """
+    model_type = "Vietnamese"
+    def __init__(
+        self,
+        vocab_size=30528,
+        hidden_size=768,
+        num_hidden_layers=12,
+        num_attention_heads=12,
+        intermediate_size=3072,
+        hidden_act="gelu",
+        hidden_dropout_prob=0.1,
+        attention_probs_dropout_prob=0.0,
+        max_position_embeddings=2048,
+        type_vocab_size=1,
+        initializer_range=0.02,
+        layer_norm_type='layer_norm',
+        layer_norm_eps=1e-12,
+        # pad_token_id=0,
+        position_embedding_type="rope",
+        rope_theta=10000.0,
+        rope_scaling=None,
+        classifier_dropout=None,
+        pack_qkv=True,
+        unpad_inputs=False,
+        use_memory_efficient_attention=False,
+        logn_attention_scale=False,
+        logn_attention_clip1=False,
+        **kwargs,
+    ):
+        super().__init__(**kwargs)
+        self.vocab_size = vocab_size
+        self.hidden_size = hidden_size
+        self.num_hidden_layers = num_hidden_layers
+        self.num_attention_heads = num_attention_heads
+        self.hidden_act = hidden_act
+        self.intermediate_size = intermediate_size
+        self.hidden_dropout_prob = hidden_dropout_prob
+        self.attention_probs_dropout_prob = attention_probs_dropout_prob
+        self.max_position_embeddings = max_position_embeddings
+        self.type_vocab_size = type_vocab_size
+        self.initializer_range = initializer_range
+        self.layer_norm_type = layer_norm_type
+        self.layer_norm_eps = layer_norm_eps
+        self.position_embedding_type = position_embedding_type
+        self.rope_theta = rope_theta
+        self.rope_scaling = rope_scaling
+        self.classifier_dropout = classifier_dropout
+        self.pack_qkv = pack_qkv
+        self.unpad_inputs = unpad_inputs
+        self.use_memory_efficient_attention = use_memory_efficient_attention
+        self.logn_attention_scale = logn_attention_scale
+        self.logn_attention_clip1 = logn_attention_clip1

model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e79c50e3061acfed843b914359897a280664f5eb43410358687eaa4b1725dcd9
+size 1221487872

modeling.py ADDED Viewed

	@@ -0,0 +1,1319 @@

+"""PyTorch Vietnamese model."""
+import math
+from dataclasses import dataclass
+from typing import List, Optional, Tuple, Union
+import torch
+import torch.utils.checkpoint
+from torch import nn
+from transformers.activations import ACT2FN
+from transformers.modeling_outputs import (
+    BaseModelOutput,
+    BaseModelOutputWithPooling,
+    MaskedLMOutput,
+    MultipleChoiceModelOutput,
+    QuestionAnsweringModelOutput,
+    SequenceClassifierOutput,
+    ModelOutput,
+)
+from transformers.modeling_utils import PreTrainedModel
+from transformers.utils import logging
+try:
+    import xformers.ops as xops
+except ImportError:
+    xops = None
+from .configuration import VietnameseConfig
+logger = logging.get_logger(__name__)
+# Adapted from https://github.com/HazyResearch/flash-attention/blob/main/flash_attn/bert_padding.py
+# Which was adapted from https://github.com/mlcommons/training_results_v1.1/blob/main/NVIDIA/benchmarks/bert/implementations/pytorch/padding.py
+class IndexFirstAxis(torch.autograd.Function):
+    @staticmethod
+    def forward(ctx, input, indices):
+        ctx.save_for_backward(indices)
+        assert input.ndim >= 2
+        ctx.first_axis_dim, other_shape = input.shape[0], input.shape[1:]
+        second_dim = other_shape.numel()
+        return torch.gather(
+            input.view(ctx.first_axis_dim, second_dim),
+            0,
+            indices.unsqueeze(-1).expand(indices.size(0), second_dim)
+        ).reshape(-1, *other_shape)
+    @staticmethod
+    def backward(ctx, grad_output):
+        (indices,) = ctx.saved_tensors
+        assert grad_output.ndim >= 2
+        other_shape = grad_output.shape[1:]
+        grad_output = grad_output.view(grad_output.size(0), other_shape.numel())
+        grad_input = torch.zeros(
+            [ctx.first_axis_dim, grad_output.shape[1]],
+            device=grad_output.device,
+            dtype=grad_output.dtype,
+        )
+        grad_input.scatter_(
+            0, indices.unsqueeze(-1).expand(indices.size(0), grad_output.size(1)), grad_output
+        )
+        return grad_input.reshape(ctx.first_axis_dim, *other_shape), None
+index_first_axis = IndexFirstAxis.apply
+def unpad_input(hidden_states, attention_mask=None, indices=None):
+    """
+    Arguments:
+        hidden_states: (batch, seqlen, ...)
+        attention_mask: (batch, seqlen), bool / int, 1 means valid and 0 means not valid.
+        indices: (total_nnz), the indices of non-masked tokens from the flattened input sequence.
+    Return:
+        hidden_states: (total_nnz, ...), where total_nnz = number of tokens selected in attention_mask.
+    """
+    if indices is None:
+        assert attention_mask is not None
+        indices = torch.nonzero(attention_mask.flatten(), as_tuple=False).flatten()
+    hidden_states = hidden_states.view(-1, *hidden_states.shape[2:])
+    return index_first_axis(hidden_states, indices)
+class IndexPutFirstAxis(torch.autograd.Function):
+    @staticmethod
+    def forward(
+        ctx,
+        values: torch.Tensor,
+        indices: torch.Tensor,
+        first_axis_dim
+    ) -> torch.Tensor:
+        ctx.save_for_backward(indices)
+        assert indices.ndim == 1
+        assert values.ndim >= 2
+        output = torch.zeros(
+            first_axis_dim, *values.shape[1:], device=values.device, dtype=values.dtype
+        )
+        output[indices] = values
+        return output
+    @staticmethod
+    def backward(ctx, grad_output: torch.Tensor) -> Tuple[torch.Tensor, None, None]:
+        indices, = ctx.saved_tensors
+        grad_values = grad_output[indices]
+        return grad_values, None, None
+index_put_first_axis = IndexPutFirstAxis.apply
+def pad_input(inputs: torch.Tensor, indices: torch.Tensor, batch: int, seqlen: int) -> torch.Tensor:
+    """Add padding to sequences.
+    Arguments:
+        inputs: (total_nnz, ...), where total_nnz = number of tokens selected in attention_mask.
+        indices: (total_nnz), `indices = torch.nonzero(attention_mask.flatten(), as_tuple=False).flatten()`
+        batch: int batch_size
+        seqlen: int max sequence length
+    Returns:
+        inputs: (batch, seqlen, ...)
+    """
+    output = index_put_first_axis(inputs, indices, batch * seqlen)
+    return output.view(batch, seqlen, *inputs.shape[1:])
+def rotate_half(x):
+    """Rotates half the hidden dims of the input."""
+    x1 = x[..., : x.shape[-1] // 2]
+    x2 = x[..., x.shape[-1] // 2 :]
+    return torch.cat((-x2, x1), dim=-1)
+def apply_rotary_pos_emb(q, k, cos, sin):
+    """Applies Rotary Position Embedding to the query and key tensors.
+    Args:
+        q (`torch.Tensor`): The query tensor.
+        k (`torch.Tensor`): The key tensor.
+        cos (`torch.Tensor`): The cosine part of the rotary embedding.
+        sin (`torch.Tensor`): The sine part of the rotary embedding.
+    Returns:
+        `tuple(torch.Tensor)` comprising the query and key tensors rotated using the Rotary Position Embedding.
+    """
+    cos, sin = cos.to(q.dtype), sin.to(q.dtype)
+    q_embed = (q * cos) + (rotate_half(q) * sin)
+    k_embed = (k * cos) + (rotate_half(k) * sin)
+    return q_embed, k_embed
+class RotaryEmbedding(torch.nn.Module):
+    def __init__(self, dim, max_position_embeddings=512, base=10000.0, device=None):
+        super().__init__()
+        self.dim = dim
+        self.max_position_embeddings = max_position_embeddings
+        self.base = base
+        inv_freq = 1.0 / (self.base ** (torch.arange(0, self.dim, 2).float().to(device) / self.dim))
+        self.register_buffer("inv_freq", inv_freq, persistent=False)
+        self._set_cos_sin_cache(
+            seq_len=max_position_embeddings, device=self.inv_freq.device, dtype=torch.get_default_dtype()
+        )
+    def _set_cos_sin_cache(self, seq_len, device, dtype):
+        self.max_seq_len_cached = seq_len
+        t = torch.arange(self.max_seq_len_cached, device=device, dtype=torch.float32)
+        freqs = torch.einsum("i,j->ij", t, self.inv_freq)
+        emb = torch.cat((freqs, freqs), dim=-1)
+        self.register_buffer("cos_cached", emb.cos().to(dtype), persistent=False)
+        self.register_buffer("sin_cached", emb.sin().to(dtype), persistent=False)
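+    def forward(self, x, seq_len=None):
+        # Fall back to the full cache when no length is given; comparing None with an int would raise TypeError.
+        if seq_len is None:
+            seq_len = self.max_seq_len_cached
+        if seq_len > self.max_seq_len_cached:
+            self._set_cos_sin_cache(seq_len=seq_len, device=x.device, dtype=x.dtype)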
+        return (
+            self.cos_cached[:seq_len, ...].to(dtype=x.dtype),
+            self.sin_cached[:seq_len, ...].to(dtype=x.dtype),
+        )
+class NTKScalingRotaryEmbedding(RotaryEmbedding):
+    """RotaryEmbedding extended with fixed and mixed NTK scaling. https://kexue.fm/archives/9706 """
+    def __init__(self, dim, max_position_embeddings=512, base=10000, device=None, scaling_factor=1.0, mixed_b=None):
+        self.scaling_factor = scaling_factor
+        self.mixed_b = mixed_b
+        super().__init__(dim, max_position_embeddings, base, device)
+        max_position_embeddings = max_position_embeddings * self.scaling_factor
+        self._set_cos_sin_cache(max_position_embeddings, self.inv_freq.device, torch.get_default_dtype())
+    def _set_cos_sin_cache(self, seq_len, device, dtype):
+        self.max_seq_len_cached = seq_len
+        if seq_len > self.max_position_embeddings:
+            base = self.base * (self.scaling_factor if self.mixed_b is None else 1)
+            inv_freq = 1.0 / (base ** (torch.arange(0, self.dim, 2).float().to(device) / self.dim))
+            if self.mixed_b is None:
+                inv_freq = inv_freq / self.scaling_factor ** (2 / self.dim)
+            else:
+                a = torch.tensor(self.scaling_factor).log() / (self.dim / 2) ** self.mixed_b
+                lambda_1_m = (a * torch.arange(1, self.dim // 2 + 1).float().to(device) ** self.mixed_b).exp()
+                inv_freq = inv_freq / lambda_1_m
+            self.register_buffer("inv_freq", inv_freq, persistent=False)
+        t = torch.arange(self.max_seq_len_cached, device=device, dtype=torch.float32)
+        freqs = torch.einsum("i,j->ij", t, self.inv_freq)
+        emb = torch.cat((freqs, freqs), dim=-1)
+        self.register_buffer("cos_cached", emb.cos().to(dtype), persistent=False)
+        self.register_buffer("sin_cached", emb.sin().to(dtype), persistent=False)
+class RMSNorm(nn.Module):
+    def __init__(self, hidden_size, eps=1e-6):
+        """
+        RMSNorm is equivalent to T5LayerNorm
+        """
+        super().__init__()
+        self.weight = nn.Parameter(torch.ones(hidden_size))
+        self.variance_epsilon = eps
+    def forward(self, hidden_states):
+        input_dtype = hidden_states.dtype
+        hidden_states = hidden_states.to(torch.float32)
+        variance = hidden_states.pow(2).mean(-1, keepdim=True)
+        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
+        return self.weight * hidden_states.to(input_dtype)
+LAYER_NORM = {
+    'layer_norm': nn.LayerNorm,
+    'rms_norm': RMSNorm
+}
+class VietnameseEmbeddings(nn.Module):
+    """
+    Embedding and Unpadding.
+    """
+    def __init__(self, config: VietnameseConfig):
+        super().__init__()
+        self.padding_idx = config.pad_token_id
+        self.word_embeddings = nn.Embedding(
+            config.vocab_size, config.hidden_size, padding_idx=self.padding_idx
+        )
+        self.position_embedding_type = config.position_embedding_type
+        if self.position_embedding_type == 'absolute':
+            self.position_embeddings = nn.Embedding(
+                config.max_position_embeddings, config.hidden_size, padding_idx=self.padding_idx
+            )
+        elif self.position_embedding_type == 'rope':
+            self._init_rope(config)
+        else:
+            raise ValueError(f"Unknown position embedding type {self.position_embedding_type}")
+        self.type_vocab_size = config.type_vocab_size
+        if self.type_vocab_size > 0:
+            self.token_type_embeddings = nn.Embedding(config.type_vocab_size, config.hidden_size)
+        self.LayerNorm = nn.LayerNorm(config.hidden_size, eps=config.layer_norm_eps)
+        self.dropout = nn.Dropout(config.hidden_dropout_prob)
+        self.register_buffer(
+            "position_ids", torch.arange(config.max_position_embeddings), persistent=False
+        )
+    def _init_rope(self, config):
+        kwargs = dict(
+            dim=int(config.hidden_size / config.num_attention_heads),
+            max_position_embeddings=config.max_position_embeddings,
+            base=config.rope_theta
+        )
+        if config.rope_scaling is None:
+            self.rotary_emb = RotaryEmbedding(**kwargs)
+        else:
+            kwargs.update(scaling_factor=config.rope_scaling["factor"])
+            scaling_type = config.rope_scaling["type"]
+            if scaling_type == 'ntk':
+                kwargs.update(mixed_b=config.rope_scaling.get('mixed_b', None))
+                self.rotary_emb = NTKScalingRotaryEmbedding(**kwargs)
+            else:
+                raise ValueError(f"Unknown RoPE scaling type {scaling_type}")
+    def forward(
+        self,
+        unpad_inputs: bool,
+        input_ids: Optional[torch.Tensor] = None,
+        attention_mask: Optional[torch.Tensor] = None,
+        length: Optional[List[int]] = None,
+        token_type_ids: Optional[torch.Tensor] = None,
+        position_ids: Optional[torch.Tensor] = None,
+        inputs_embeds: Optional[torch.Tensor] = None,
+    ) -> Tuple[torch.Tensor, torch.Tensor, Optional[Tuple], Optional[List[int]]]:
+        if inputs_embeds is None:
+            device, input_shape = input_ids.device, input_ids.shape
+        else:
+            device, input_shape = inputs_embeds.device, inputs_embeds.shape[:2]
+        batch_size, seq_length = input_shape
+        if attention_mask is None:
+            attention_mask = torch.ones(input_shape, device=device)
+            if length is not None:
+                for i, l in enumerate(length):
+                    attention_mask[i, l:] = 0
+        if unpad_inputs:
+            attention_mask_bool = attention_mask.bool()
+            if length is None:
+                length = attention_mask.sum(-1).tolist()
+        if inputs_embeds is None:
+            if unpad_inputs:
+                input_ids = input_ids[attention_mask_bool].unsqueeze(0)
+            inputs_embeds = self.word_embeddings(input_ids)
+        else:
+            if unpad_inputs:
+                inputs_embeds = inputs_embeds[attention_mask_bool].unsqueeze(0)
+        embeddings = inputs_embeds
+        if position_ids is None:
+            if seq_length > self.position_ids.size(0):
+                self.register_buffer(
+                    "position_ids", torch.arange(seq_length, device=embeddings.device), persistent=False
+                )
+            if unpad_inputs:
+                position_ids = torch.cat([self.position_ids[:l] for l in length]).unsqueeze(0)
+            else:
+                position_ids = self.position_ids[:seq_length].expand(batch_size, -1)
+        elif unpad_inputs:
+            position_ids = position_ids[attention_mask_bool].unsqueeze(0)
+        if self.position_embedding_type == 'rope':
+            rope_cos, rope_sin = self.rotary_emb(inputs_embeds, seq_len=seq_length)
+            rope_cos = rope_cos[position_ids].unsqueeze(2)
+            rope_sin = rope_sin[position_ids].unsqueeze(2)
+            rope_embeds = rope_cos, rope_sin
+        else:
+            rope_embeds = None
+        if self.type_vocab_size > 0:
+            if token_type_ids is None:
+                token_type_ids = position_ids.mul(0)
+            else:
+                if self.type_vocab_size < 2:
+                    token_type_ids = token_type_ids.mul(0)  # out-of-place, so the caller's tensor is not mutated
+                if unpad_inputs:
+                    token_type_ids = token_type_ids[attention_mask_bool].unsqueeze(0)
+            token_type_embeddings = self.token_type_embeddings(token_type_ids)
+            embeddings = embeddings + token_type_embeddings
+        if self.position_embedding_type == "absolute":
+            position_embeddings = self.position_embeddings(position_ids)
+            embeddings = embeddings + position_embeddings
+        embeddings = self.LayerNorm(embeddings)
+        embeddings = self.dropout(embeddings)
+        return embeddings, attention_mask, rope_embeds, length
+class VietnameseAttention(nn.Module):
+    def __init__(self, config: VietnameseConfig, pack_qkv=None, use_memory_efficient_attention=None):
+        super().__init__()
+        self.config = config
+        if config.hidden_size % config.num_attention_heads != 0 and not hasattr(config, "embedding_size"):
+            raise ValueError(
+                f"The hidden size ({config.hidden_size}) is not a multiple of the number of attention "
+                f"heads ({config.num_attention_heads})"
+            )
+        self.hidden_size = config.hidden_size
+        self.num_attention_heads = config.num_attention_heads
+        self.attention_head_size = config.hidden_size // config.num_attention_heads
+        self.all_head_size = self.num_attention_heads * self.attention_head_size
+        if pack_qkv is None:
+            pack_qkv = config.pack_qkv
+        self.pack_qkv = pack_qkv
+        if self.pack_qkv:
+            self.qkv_proj = nn.Linear(config.hidden_size, self.all_head_size * 3, bias=True)
+        else:
+            self.q_proj = nn.Linear(config.hidden_size, self.all_head_size, bias=True)
+            self.k_proj = nn.Linear(config.hidden_size, self.all_head_size, bias=True)
+            self.v_proj = nn.Linear(config.hidden_size, self.all_head_size, bias=True)
+        self.dropout = nn.Dropout(config.attention_probs_dropout_prob)
+        self.o_proj = nn.Linear(config.hidden_size, config.hidden_size, bias=True)
+        if use_memory_efficient_attention is None:
+            use_memory_efficient_attention = self.config.use_memory_efficient_attention
+        self.use_memory_efficient_attention = use_memory_efficient_attention
+        self.memory_efficient_attention = None if xops is None else xops.memory_efficient_attention
+        if self.use_memory_efficient_attention:
+            assert self.memory_efficient_attention is not None, 'please install xformers'
+    def forward(
+        self,
+        hidden_states: torch.Tensor,
+        attention_bias: torch.FloatTensor,
+        rope_embeds: Optional[Tuple[torch.FloatTensor, torch.FloatTensor]] = None,
+        padding_inputs: Optional[Tuple] = None,
+        attention_scale: Optional[torch.FloatTensor] = None,
+        head_mask: Optional[torch.FloatTensor] = None,
+        output_attentions: Optional[bool] = False,
+        qkv_inputs: Optional[Tuple] = None,
+    ) -> Tuple[torch.Tensor, ...]:
+        shape_hd = (self.num_attention_heads, self.attention_head_size)
+        if self.pack_qkv and qkv_inputs is None:
+            qkv_pack = self.qkv_proj(hidden_states).split(self.all_head_size, dim=-1)
+        else:
+            if qkv_inputs is None:
+                qkv_inputs = (hidden_states, hidden_states, hidden_states)
+            qkv_pack = [
+                getattr(self, n + '_proj')(s) for s, n in zip(qkv_inputs, 'qkv')
+            ]
+        query_states, key_states, value_states = [t.view(t.shape[:-1] + shape_hd) for t in qkv_pack]
+        if self.config.position_embedding_type == 'rope':
+            query_states, key_states = apply_rotary_pos_emb(query_states, key_states, *rope_embeds)
+        dtype = query_states.dtype
+        if self.config.logn_attention_scale and attention_scale is not None:
+            query_states = query_states * attention_scale.to(dtype)
+        if padding_inputs is not None:
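+            # Drop only the leading packed-batch dimension; a bare squeeze() would also remove any other size-1 dim.
+            query_states = pad_input(query_states.squeeze(0), *padding_inputs)
+            key_states = pad_input(key_states.squeeze(0), *padding_inputs)
+            value_states = pad_input(value_states.squeeze(0), *padding_inputs)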
+        if self.use_memory_efficient_attention:
+            assert self.memory_efficient_attention is not None, "xformers is not loaded"
+            assert output_attentions is False, "memory_efficient_attention does not output attentions"
+            assert head_mask is None, "head_mask is not supported yet"
+            attention_probs = None
+            if torch.is_tensor(attention_bias):
+                attention_bias = attention_bias.to(dtype)
+            context_layer = self.memory_efficient_attention(
+                query_states,
+                key_states,
+                value_states,
+                attn_bias=attention_bias,
+                p=self.dropout.p if self.training else 0.0,  # no attention dropout at inference time
+            )
+        else:
+            if output_attentions and isinstance(self, VietnameseSdpaAttention):
+                raise RuntimeError("SDPA does not output attentions")
+            context_layer, attention_probs = self._attention(
+                query_states, key_states, value_states, attention_bias, head_mask
+            )
+        if padding_inputs is not None:
+            context_layer = unpad_input(context_layer, indices=padding_inputs[0])
+        new_context_layer_shape = context_layer.size()[:-2] + (self.all_head_size,)
+        context_layer = context_layer.view(new_context_layer_shape)
+        attn_output = self.o_proj(context_layer)
+        outputs = (attn_output, attention_probs) if output_attentions else (attn_output,)
+        return outputs
+    def _attention(self, query_states, key_states, value_states, attention_bias, head_mask):
+        query_states = query_states.transpose(1, 2)
+        key_states = key_states.transpose(1, 2)
+        value_states = value_states.transpose(1, 2)
+        attention_scores = torch.matmul(query_states, key_states.transpose(-1, -2))
+        attention_scores = attention_scores / math.sqrt(self.attention_head_size)
+        if attention_bias is not None:
+            attention_scores = attention_scores + attention_bias
+        attention_probs = nn.functional.softmax(attention_scores, dim=-1)
+        if self.dropout.p > 0:
+            attention_probs = self.dropout(attention_probs)
+        if head_mask is not None:
+            attention_probs = attention_probs * head_mask
+        context_layer = torch.matmul(attention_probs, value_states)
+        context_layer = context_layer.permute(0, 2, 1, 3).contiguous()
+        return context_layer, attention_probs
+class VietnameseSdpaAttention(VietnameseAttention):
+    """
+    Vietnamese attention module using torch.nn.functional.scaled_dot_product_attention. This module inherits from
+    `VietnameseAttention` because the weights of the module stay untouched. The only changes are in the forward pass,
+    to adapt it to the SDPA API.
+    """
+    def __init__(self, config: VietnameseConfig, **kwargs):
+        super().__init__(config, **kwargs)
+    def _attention(self, query_states, key_states, value_states, attention_bias, head_mask):
+        attn_output = torch.nn.functional.scaled_dot_product_attention(
+            query_states.transpose(1, 2),
+            key_states.transpose(1, 2),
+            value_states.transpose(1, 2),
+            attn_mask=attention_bias,
+            dropout_p=self.dropout.p if self.training else 0.0,
+        )
+        attn_output = attn_output.permute(0, 2, 1, 3).contiguous()
+        return attn_output, None
+Vietnamese_ATTENTION_CLASSES = {
+    "eager": VietnameseAttention,
+    "sdpa": VietnameseSdpaAttention,
+}
+class VietnameseGatedMLP(nn.Module):
+    """
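+    Gated MLP following `GLU Variants Improve Transformer`: the activated gate half of the up projection
+    multiplies the other half element-wise before the down projection.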
+    """
+    def __init__(self, config: VietnameseConfig):
+        super().__init__()
+        self.intermediate_size = config.intermediate_size
+        self.up_gate_proj = nn.Linear(config.hidden_size, self.intermediate_size * 2, bias=False)
+        self.down_proj = nn.Linear(self.intermediate_size, config.hidden_size, bias=True)
+        self.act_fn = ACT2FN[config.hidden_act]
+        if config.hidden_dropout_prob > 0:
+            self.hidden_dropout = nn.Dropout(config.hidden_dropout_prob)
+        else:
+            self.hidden_dropout = None
+    def forward(self, hidden_states):
+        up_gate = self.up_gate_proj(hidden_states)
+        up_states, gate = torch.split(up_gate, self.intermediate_size, dim=-1)
+        gate = self.act_fn(gate)
+        gated_states = gate * up_states
+        if self.hidden_dropout is not None:
+            gated_states = self.hidden_dropout(gated_states)
+        down_states = self.down_proj(gated_states)
+        return down_states
+class VietnameseLayer(nn.Module):
+    def __init__(
+        self,
+        config: VietnameseConfig,
+        pack_qkv=None,
+        use_memory_efficient_attention=None,
+        attn_implementation=None
+    ):
+        super().__init__()
+        if attn_implementation is None:
+            attn_implementation = config._attn_implementation
+        if use_memory_efficient_attention is None:
+            use_memory_efficient_attention = config.use_memory_efficient_attention
+        if use_memory_efficient_attention:
+            if attn_implementation != 'eager':
+                logger.warning_once(f"Override {attn_implementation=} to 'eager' as {use_memory_efficient_attention=}")
+                attn_implementation = 'eager'
+        self.attention = Vietnamese_ATTENTION_CLASSES[attn_implementation](
+            config, pack_qkv=pack_qkv, use_memory_efficient_attention=use_memory_efficient_attention
+        )
+        self.mlp = VietnameseGatedMLP(config)
+        ln_class = LAYER_NORM[config.layer_norm_type]
+        self.attn_ln = ln_class(config.hidden_size, eps=config.layer_norm_eps)
+        self.mlp_ln = ln_class(config.hidden_size, eps=config.layer_norm_eps)
+        if config.hidden_dropout_prob > 0:
+            self.hidden_dropout = nn.Dropout(config.hidden_dropout_prob)
+        else:
+            self.hidden_dropout = None
+    def forward(
+        self,
+        hidden_states: torch.Tensor,
+        attention_bias: torch.FloatTensor,
+        rope_embeds: Optional[Tuple[torch.FloatTensor, torch.FloatTensor]] = None,
+        padding_inputs: Optional[Tuple] = None,
+        attention_scale: Optional[torch.FloatTensor] = None,
+        subset_indices: Optional[torch.LongTensor] = None,
+        head_mask: Optional[torch.FloatTensor] = None,
+        output_attentions: Optional[bool] = False,
+        qkv_inputs: Optional[Tuple] = None,
+    ) -> Tuple[torch.Tensor, ...]:
+        residual = hidden_states if qkv_inputs is None else qkv_inputs[0]
+        attention_outputs = self.attention(
+            hidden_states,
+            attention_bias,
+            rope_embeds,
+            padding_inputs,
+            attention_scale,
+            head_mask,
+            output_attentions=output_attentions,
+            qkv_inputs=qkv_inputs,
+        )
+        hidden_states = attention_outputs[0]
+        if self.hidden_dropout is not None:
+            hidden_states = self.hidden_dropout(hidden_states)
+        hidden_states = residual + hidden_states
+        if subset_indices is not None:
+            hidden_states = hidden_states[subset_indices]
+        hidden_states = self.attn_ln(hidden_states)
+        residual = hidden_states
+        hidden_states = self.mlp(hidden_states)
+        if self.hidden_dropout is not None:
+            hidden_states = self.hidden_dropout(hidden_states)
+        hidden_states = residual + hidden_states
+        hidden_states = self.mlp_ln(hidden_states)
+        outputs = (hidden_states,) + attention_outputs[1:]
+        return outputs
+class VietnameseEncoder(nn.Module):
+    def __init__(self, config):
+        super().__init__()
+        self.config = config
+        self.layer = nn.ModuleList([VietnameseLayer(config) for _ in range(config.num_hidden_layers)])
+        self.gradient_checkpointing = False
+    def forward(
+        self,
+        hidden_states: torch.Tensor,
+        attention_bias: Optional[torch.FloatTensor] = None,
+        rope_embeds: Optional[Tuple[torch.FloatTensor, torch.FloatTensor]] = None,
+        padding_inputs: Optional[Tuple] = None,
+        attention_scale: Optional[torch.FloatTensor] = None,
+        subset_indices: Optional[torch.LongTensor] = None,
+        head_mask: Optional[torch.FloatTensor] = None,
+        output_attentions: Optional[bool] = False,
+        output_hidden_states: Optional[bool] = False,
+        return_dict: Optional[bool] = True,
+    ) -> Union[Tuple[torch.Tensor], BaseModelOutput]:
+        all_hidden_states = () if output_hidden_states else None
+        all_self_attentions = () if output_attentions else None
+        for i, layer_module in enumerate(self.layer):
+            if output_hidden_states:
+                all_hidden_states = all_hidden_states + (hidden_states,)
+            if i >= len(self.layer) - 1:
+                layer_subset_indices = subset_indices
+            else:
+                layer_subset_indices = None
+            layer_head_mask = head_mask[i] if head_mask is not None else None
+            if self.gradient_checkpointing and self.training:
+                layer_outputs = self._gradient_checkpointing_func(
+                    layer_module.__call__,
+                    hidden_states,
+                    attention_bias,
+                    rope_embeds,
+                    padding_inputs,
+                    attention_scale,
+                    layer_subset_indices,
+                    layer_head_mask,
+                    output_attentions,
+                )
+            else:
+                layer_outputs = layer_module(
+                    hidden_states,
+                    attention_bias,
+                    rope_embeds,
+                    padding_inputs,
+                    attention_scale,
+                    layer_subset_indices,
+                    layer_head_mask,
+                    output_attentions,
+                )
+            hidden_states = layer_outputs[0]
+            if output_attentions:
+                all_self_attentions = all_self_attentions + (layer_outputs[1],)
+        if output_hidden_states:
+            all_hidden_states = all_hidden_states + (hidden_states,)
+        if not return_dict:
+            return tuple(
+                v
+                for v in [
+                    hidden_states,
+                    all_hidden_states,
+                    all_self_attentions,
+                ]
+                if v is not None
+            )
+        return BaseModelOutput(
+            last_hidden_state=hidden_states,
+            hidden_states=all_hidden_states,
+            attentions=all_self_attentions,
+        )
+class VietnamesePooler(nn.Module):
+    def __init__(self, config):
+        super().__init__()
+        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
+        self.activation = nn.Tanh()
+    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
+        first_token_tensor = hidden_states[:, 0]
+        pooled_output = self.dense(first_token_tensor)
+        pooled_output = self.activation(pooled_output)
+        return pooled_output
+class VietnamesePreTrainedModel(PreTrainedModel):
+    """
+    An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained
+    models.
+    """
+    config_class = VietnameseConfig
+    base_model_prefix = "Vietnamese"
+    supports_gradient_checkpointing = True
+    _supports_sdpa = True
+    def _init_weights(self, module):
+        """Initialize the weights"""
+        if isinstance(module, nn.Linear):
+            module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)
+            if module.bias is not None:
+                module.bias.data.zero_()
+        elif isinstance(module, nn.Embedding):
+            module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)
+            if module.padding_idx is not None:
+                module.weight.data[module.padding_idx].zero_()
+        elif isinstance(module, nn.LayerNorm):
+            module.bias.data.zero_()
+            module.weight.data.fill_(1.0)
+class VietnameseModel(VietnamesePreTrainedModel):
+    """
+    The bare Vietnamese Model transformer outputting raw hidden-states without any specific head on top.
+    """
+    def __init__(self, config: VietnameseConfig, add_pooling_layer=False):
+        super().__init__(config)
+        self.config = config
+        self.embeddings = VietnameseEmbeddings(config)
+        self.encoder = VietnameseEncoder(config)
+        self.pooler = VietnamesePooler(config) if add_pooling_layer else None
+        self.post_init()
+    def get_input_embeddings(self):
+        return self.embeddings.word_embeddings
+    def set_input_embeddings(self, value):
+        self.embeddings.word_embeddings = value
+    def forward(
+        self,
+        input_ids: Optional[torch.Tensor] = None,
+        attention_mask: Optional[torch.Tensor] = None,
+        length: Optional[List[int]] = None,
+        subset_indices: Optional[torch.LongTensor] = None,
+        token_type_ids: Optional[torch.Tensor] = None,
+        position_ids: Optional[torch.Tensor] = None,
+        head_mask: Optional[torch.Tensor] = None,
+        inputs_embeds: Optional[torch.Tensor] = None,
+        output_attentions: Optional[bool] = None,
+        output_hidden_states: Optional[bool] = None,
+        return_dict: Optional[bool] = None,
+        unpad_inputs: Optional[bool] = None,
+    ) -> Union[Tuple[torch.Tensor], BaseModelOutputWithPooling]:
+        r"""
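+        length (`list` of `int` with `batch_size` entries, *optional*):
+            Number of non-padding tokens in each sequence. If `None`, the returned `last_hidden_state` is padded back
+            to `(batch_size, seq_length)`. If given together with `unpad_inputs=True`, the output stays unpadded.
+        subset_indices (`torch.Tensor`, *optional*):
+            Index or boolean mask applied to the hidden states inside the last encoder layer, so that only the
+            selected tokens are passed on (`VietnameseForMaskedLM` uses it to keep labeled tokens only).
+        unpad_inputs (`bool`, *optional*):
+            Whether to strip padding tokens and run the encoder on the packed tokens. Defaults to
+            `config.unpad_inputs`.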
+        """
+        output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
+        output_hidden_states = (
+            output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states
+        )
+        return_dict = return_dict if return_dict is not None else self.config.use_return_dict
+        unpad_inputs = unpad_inputs if unpad_inputs is not None else self.config.unpad_inputs
+        output_padded = length is None
+        if input_ids is not None and inputs_embeds is not None:
+            raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time")
+        elif input_ids is not None:
+            self.warn_if_padding_and_no_attention_mask(input_ids, attention_mask)
+            input_shape = input_ids.size()
+        elif inputs_embeds is not None:
+            input_shape = inputs_embeds.size()[:-1]
+        else:
+            raise ValueError("You have to specify either input_ids or inputs_embeds")
+        (embedding_output, attention_mask, rope_embeds, length) = self.embeddings(
+            unpad_inputs,
+            input_ids=input_ids,
+            attention_mask=attention_mask,
+            length=length,
+            token_type_ids=token_type_ids,
+            position_ids=position_ids,
+            inputs_embeds=inputs_embeds
+        )
+        batch_size, seq_length = input_shape
+        if unpad_inputs and self.config.use_memory_efficient_attention:
+            attention_bias = xops.fmha.attn_bias.BlockDiagonalMask.from_seqlens(length)
+        else:
+            attention_bias = self.get_extended_attention_mask(attention_mask, input_shape)
+            if self.config.use_memory_efficient_attention:
+                attention_bias = attention_bias.expand(-1, self.config.num_attention_heads, seq_length, -1)
+        padding_inputs = None
+        if unpad_inputs and (output_padded or not self.config.use_memory_efficient_attention):
+            indices = torch.nonzero(attention_mask.flatten(), as_tuple=False).flatten()
+            if not self.config.use_memory_efficient_attention:
+                padding_inputs = (indices, *input_shape)
+        attention_scale = None
+        if self.config.logn_attention_scale:
+            logger.warning_once("logn_attention_scale is enabled in the config but not implemented yet; it is ignored")
+        encoder_outputs = self.encoder(
+            embedding_output,
+            attention_bias=attention_bias,
+            rope_embeds=rope_embeds,
+            padding_inputs=padding_inputs,
+            attention_scale=attention_scale,
+            subset_indices=subset_indices,
+            head_mask=head_mask,
+            output_attentions=output_attentions,
+            output_hidden_states=output_hidden_states,
+            return_dict=return_dict,
+        )
+        sequence_output = encoder_outputs[0]
+        if unpad_inputs and output_padded:
+            sequence_output = pad_input(
+                sequence_output.squeeze(0), indices, batch_size, seq_length
+            )
+        pooled_output = self.pooler(sequence_output) if self.pooler is not None else None
+        if not return_dict:
+            return (sequence_output, pooled_output) + encoder_outputs[1:]
+        return BaseModelOutputWithPooling(
+            last_hidden_state=sequence_output,
+            pooler_output=pooled_output,
+            hidden_states=encoder_outputs.hidden_states,
+            attentions=encoder_outputs.attentions,
+        )
+class VietnameseLMPredictionHead(nn.Module):
+    def __init__(self, config):
+        super().__init__()
+        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
+        self.transform_act_fn = ACT2FN[config.hidden_act]
+        self.norm = nn.LayerNorm(config.hidden_size, eps=config.layer_norm_eps)
+        self.decoder = nn.Linear(config.hidden_size, config.vocab_size)
+    def forward(self, hidden_states):
+        hidden_states = self.dense(hidden_states)
+        hidden_states = self.transform_act_fn(hidden_states)
+        hidden_states = self.norm(hidden_states)
+        hidden_states = self.decoder(hidden_states)
+        return hidden_states
+class VietnameseForMaskedLM(VietnamesePreTrainedModel):
+    _tied_weights_keys = ["lm_head.decoder.bias", "lm_head.decoder.weight"]
+    def __init__(self, config: VietnameseConfig):
+        super().__init__(config)
+        self.Vietnamese = VietnameseModel(config, add_pooling_layer=False)
+        self.lm_head = VietnameseLMPredictionHead(config)
+        self.loss_fct = nn.CrossEntropyLoss()
+        self.post_init()
+    def get_output_embeddings(self):
+        return self.lm_head.decoder
+    def set_output_embeddings(self, new_embeddings):
+        self.lm_head.decoder = new_embeddings
+    def forward(
+        self,
+        input_ids: Optional[torch.Tensor] = None,
+        attention_mask: Optional[torch.Tensor] = None,
+        token_type_ids: Optional[torch.Tensor] = None,
+        position_ids: Optional[torch.Tensor] = None,
+        head_mask: Optional[torch.Tensor] = None,
+        inputs_embeds: Optional[torch.Tensor] = None,
+        labels: Optional[torch.Tensor] = None,
+        output_attentions: Optional[bool] = None,
+        output_hidden_states: Optional[bool] = None,
+        return_dict: Optional[bool] = None,
+        unpad_inputs: Optional[bool] = None,
+    ) -> Union[Tuple[torch.Tensor], MaskedLMOutput]:
+        r"""
+        labels (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*):
+            Labels for computing the masked language modeling loss. Indices should be `-100` or in `[0, ...,
+            config.vocab_size - 1]` (see the `input_ids` docstring). Tokens with indices set to `-100` are ignored
+            (masked); the loss is only computed for the tokens with labels in `[0, ..., config.vocab_size - 1]`.
+        """
+        return_dict = return_dict if return_dict is not None else self.config.use_return_dict
+        if labels is None or not self.Vietnamese.config.unpad_inputs:
+            length = None
+            subset_indices = None
+        else:
+            length = attention_mask.sum(-1).tolist()
+            labels = labels[attention_mask.bool()].unsqueeze(0)
+            subset_indices = labels > -100
+        outputs = self.Vietnamese(
+            input_ids,
+            attention_mask=attention_mask,
+            length=length,
+            subset_indices=subset_indices,
+            token_type_ids=token_type_ids,
+            position_ids=position_ids,
+            head_mask=head_mask,
+            inputs_embeds=inputs_embeds,
+            output_attentions=output_attentions,
+            output_hidden_states=output_hidden_states,
+            return_dict=return_dict,
+            unpad_inputs=unpad_inputs,
+        )
+        sequence_output = outputs[0]
+        prediction_scores = self.lm_head(sequence_output)
+        masked_lm_loss = None
+        if labels is not None:
+            if subset_indices is None:
+                mask = attention_mask.bool()
+                prediction_scores = prediction_scores[mask]
+                labels = labels[mask]
+            else:
+                labels = labels[subset_indices]
+            masked_lm_loss = self.loss_fct(prediction_scores, labels)
+        if not return_dict:
+            output = (prediction_scores,) + outputs[2:]
+            return ((masked_lm_loss,) + output) if masked_lm_loss is not None else output
+        return MaskedLMOutput(
+            loss=masked_lm_loss,
+            logits=prediction_scores,
+            hidden_states=outputs.hidden_states,
+            attentions=outputs.attentions,
+        )
+class VietnameseForSequenceClassification(VietnamesePreTrainedModel):
+    def __init__(self, config):
+        super().__init__(config)
+        self.num_labels = config.num_labels
+        self.config = config
+        self.Vietnamese = VietnameseModel(config, add_pooling_layer=True)
+        classifier_dropout = (
+            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
+        )
+        self.dropout = nn.Dropout(classifier_dropout)
+        self.classifier = nn.Linear(config.hidden_size, config.num_labels)
+        self.post_init()
+    def forward(
+        self,
+        input_ids: Optional[torch.Tensor] = None,
+        attention_mask: Optional[torch.Tensor] = None,
+        token_type_ids: Optional[torch.Tensor] = None,
+        position_ids: Optional[torch.Tensor] = None,
+        head_mask: Optional[torch.Tensor] = None,
+        inputs_embeds: Optional[torch.Tensor] = None,
+        labels: Optional[torch.Tensor] = None,
+        output_attentions: Optional[bool] = None,
+        output_hidden_states: Optional[bool] = None,
+        return_dict: Optional[bool] = None,
+        unpad_inputs: Optional[bool] = None,
+    ) -> Union[Tuple[torch.Tensor], SequenceClassifierOutput]:
+        r"""
+        labels (`torch.LongTensor` of shape `(batch_size,)`, *optional*):
+            Labels for computing the sequence classification/regression loss. Indices should be in `[0, ...,
+            config.num_labels - 1]`. If `config.num_labels == 1`, a regression loss is computed (Mean-Square loss). If
+            `config.num_labels > 1`, a classification loss is computed (Cross-Entropy).
+        """
+        return_dict = return_dict if return_dict is not None else self.config.use_return_dict
+        outputs = self.Vietnamese(
+            input_ids,
+            attention_mask=attention_mask,
+            token_type_ids=token_type_ids,
+            position_ids=position_ids,
+            head_mask=head_mask,
+            inputs_embeds=inputs_embeds,
+            output_attentions=output_attentions,
+            output_hidden_states=output_hidden_states,
+            return_dict=return_dict,
+            unpad_inputs=unpad_inputs,
+        )
+        pooled_output = outputs[1]
+        pooled_output = self.dropout(pooled_output)
+        logits = self.classifier(pooled_output)
+        loss = None
+        if labels is not None:
+            if self.config.problem_type is None:
+                if self.num_labels == 1:
+                    self.config.problem_type = "regression"
+                elif self.num_labels > 1 and (labels.dtype == torch.long or labels.dtype == torch.int):
+                    self.config.problem_type = "single_label_classification"
+                else:
+                    self.config.problem_type = "multi_label_classification"
+            if self.config.problem_type == "regression":
+                loss_fct = nn.MSELoss()
+                if self.num_labels == 1:
+                    loss = loss_fct(logits.squeeze(), labels.squeeze())
+                else:
+                    loss = loss_fct(logits, labels)
+            elif self.config.problem_type == "single_label_classification":
+                loss_fct = nn.CrossEntropyLoss()
+                loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
+            elif self.config.problem_type == "multi_label_classification":
+                loss_fct = nn.BCEWithLogitsLoss()
+                loss = loss_fct(logits, labels)
+        if not return_dict:
+            output = (logits,) + outputs[2:]
+            return ((loss,) + output) if loss is not None else output
+        return SequenceClassifierOutput(
+            loss=loss,
+            logits=logits,
+            hidden_states=outputs.hidden_states,
+            attentions=outputs.attentions,
+        )
+class VietnameseForMultipleChoice(VietnamesePreTrainedModel):
+    def __init__(self, config):
+        super().__init__(config)
+        self.Vietnamese = VietnameseModel(config, add_pooling_layer=True)
+        classifier_dropout = (
+            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
+        )
+        self.dropout = nn.Dropout(classifier_dropout)
+        self.classifier = nn.Linear(config.hidden_size, 1)
+        self.post_init()
+    def forward(
+        self,
+        input_ids: Optional[torch.Tensor] = None,
+        attention_mask: Optional[torch.Tensor] = None,
+        token_type_ids: Optional[torch.Tensor] = None,
+        position_ids: Optional[torch.Tensor] = None,
+        head_mask: Optional[torch.Tensor] = None,
+        inputs_embeds: Optional[torch.Tensor] = None,
+        labels: Optional[torch.Tensor] = None,
+        output_attentions: Optional[bool] = None,
+        output_hidden_states: Optional[bool] = None,
+        return_dict: Optional[bool] = None,
+        unpad_inputs: Optional[bool] = None,
+    ) -> Union[Tuple[torch.Tensor], MultipleChoiceModelOutput]:
+        r"""
+        labels (`torch.LongTensor` of shape `(batch_size,)`, *optional*):
+            Labels for computing the multiple choice classification loss. Indices should be in `[0, ...,
+            num_choices-1]` where `num_choices` is the size of the second dimension of the input tensors. (See
+            `input_ids` above)
+        """
+        return_dict = return_dict if return_dict is not None else self.config.use_return_dict
+        num_choices = input_ids.shape[1] if input_ids is not None else inputs_embeds.shape[1]
+        input_ids = input_ids.view(-1, input_ids.size(-1)) if input_ids is not None else None
+        attention_mask = attention_mask.view(-1, attention_mask.size(-1)) if attention_mask is not None else None
+        token_type_ids = token_type_ids.view(-1, token_type_ids.size(-1)) if token_type_ids is not None else None
+        position_ids = position_ids.view(-1, position_ids.size(-1)) if position_ids is not None else None
+        inputs_embeds = (
+            inputs_embeds.view(-1, inputs_embeds.size(-2), inputs_embeds.size(-1))
+            if inputs_embeds is not None
+            else None
+        )
+        outputs = self.Vietnamese(
+            input_ids,
+            attention_mask=attention_mask,
+            token_type_ids=token_type_ids,
+            position_ids=position_ids,
+            head_mask=head_mask,
+            inputs_embeds=inputs_embeds,
+            output_attentions=output_attentions,
+            output_hidden_states=output_hidden_states,
+            return_dict=return_dict,
+            unpad_inputs=unpad_inputs,
+        )
+        pooled_output = outputs[1]
+        pooled_output = self.dropout(pooled_output)
+        logits = self.classifier(pooled_output)
+        reshaped_logits = logits.view(-1, num_choices)
+        loss = None
+        if labels is not None:
+            loss_fct = nn.CrossEntropyLoss()
+            loss = loss_fct(reshaped_logits, labels)
+        if not return_dict:
+            output = (reshaped_logits,) + outputs[2:]
+            return ((loss,) + output) if loss is not None else output
+        return MultipleChoiceModelOutput(
+            loss=loss,
+            logits=reshaped_logits,
+            hidden_states=outputs.hidden_states,
+            attentions=outputs.attentions,
+        )
+@dataclass
+class VietnameseTokenClassifierOutput(ModelOutput):
+    loss: Optional[torch.FloatTensor] = None
+    logits: torch.FloatTensor = None
+    last_hidden_state: torch.FloatTensor = None
+    hidden_states: Optional[Tuple[torch.FloatTensor, ...]] = None
+    attentions: Optional[Tuple[torch.FloatTensor, ...]] = None
+class VietnameseForTokenClassification(VietnamesePreTrainedModel):
+    def __init__(self, config):
+        super().__init__(config)
+        self.num_labels = config.num_labels
+        self.Vietnamese = VietnameseModel(config, add_pooling_layer=False)
+        classifier_dropout = (
+            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
+        )
+        self.dropout = nn.Dropout(classifier_dropout)
+        self.classifier = nn.Linear(config.hidden_size, config.num_labels)
+        self.post_init()
+    def forward(
+        self,
+        input_ids: Optional[torch.Tensor] = None,
+        attention_mask: Optional[torch.Tensor] = None,
+        token_type_ids: Optional[torch.Tensor] = None,
+        position_ids: Optional[torch.Tensor] = None,
+        head_mask: Optional[torch.Tensor] = None,
+        inputs_embeds: Optional[torch.Tensor] = None,
+        labels: Optional[torch.Tensor] = None,
+        output_attentions: Optional[bool] = None,
+        output_hidden_states: Optional[bool] = None,
+        return_dict: Optional[bool] = None,
+        unpad_inputs: Optional[bool] = None,
+    ) -> Union[Tuple[torch.Tensor], VietnameseTokenClassifierOutput]:
+        r"""
+        labels (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*):
+            Labels for computing the token classification loss. Indices should be in `[0, ..., config.num_labels - 1]`.
+        """
+        return_dict = return_dict if return_dict is not None else self.config.use_return_dict
+        outputs = self.Vietnamese(
+            input_ids,
+            attention_mask=attention_mask,
+            token_type_ids=token_type_ids,
+            position_ids=position_ids,
+            head_mask=head_mask,
+            inputs_embeds=inputs_embeds,
+            output_attentions=output_attentions,
+            output_hidden_states=output_hidden_states,
+            return_dict=return_dict,
+            unpad_inputs=unpad_inputs,
+        )
+        sequence_output = outputs[0]
+        sequence_output = self.dropout(sequence_output)
+        logits = self.classifier(sequence_output)
+        loss = None
+        if labels is not None:
+            loss_fct = nn.CrossEntropyLoss()
+            loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
+        if not return_dict:
+            output = (logits,) + outputs[2:]
+            return ((loss,) + output) if loss is not None else output
+        return VietnameseTokenClassifierOutput(
+            loss=loss,
+            logits=logits,
+            last_hidden_state=sequence_output,
+            hidden_states=outputs.hidden_states,
+            attentions=outputs.attentions,
+        )
+class VietnameseForQuestionAnswering(VietnamesePreTrainedModel):
+    def __init__(self, config):
+        super().__init__(config)
+        self.num_labels = config.num_labels
+        self.Vietnamese = VietnameseModel(config, add_pooling_layer=False)
+        self.qa_outputs = nn.Linear(config.hidden_size, config.num_labels)
+        self.post_init()
+    def forward(
+        self,
+        input_ids: Optional[torch.Tensor] = None,
+        attention_mask: Optional[torch.Tensor] = None,
+        token_type_ids: Optional[torch.Tensor] = None,
+        position_ids: Optional[torch.Tensor] = None,
+        head_mask: Optional[torch.Tensor] = None,
+        inputs_embeds: Optional[torch.Tensor] = None,
+        start_positions: Optional[torch.Tensor] = None,
+        end_positions: Optional[torch.Tensor] = None,
+        output_attentions: Optional[bool] = None,
+        output_hidden_states: Optional[bool] = None,
+        return_dict: Optional[bool] = None,
+        unpad_inputs: Optional[bool] = None,
+    ) -> Union[Tuple[torch.Tensor], QuestionAnsweringModelOutput]:
+        r"""
+        start_positions (`torch.LongTensor` of shape `(batch_size,)`, *optional*):
+            Labels for position (index) of the start of the labelled span for computing the token classification loss.
+            Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence
+            are not taken into account when computing the loss.
+        end_positions (`torch.LongTensor` of shape `(batch_size,)`, *optional*):
+            Labels for position (index) of the end of the labelled span for computing the token classification loss.
+            Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence
+            are not taken into account when computing the loss.
+        """
+        return_dict = return_dict if return_dict is not None else self.config.use_return_dict
+        outputs = self.Vietnamese(
+            input_ids,
+            attention_mask=attention_mask,
+            token_type_ids=token_type_ids,
+            position_ids=position_ids,
+            head_mask=head_mask,
+            inputs_embeds=inputs_embeds,
+            output_attentions=output_attentions,
+            output_hidden_states=output_hidden_states,
+            return_dict=return_dict,
+            unpad_inputs=unpad_inputs,
+        )
+        sequence_output = outputs[0]
+        logits = self.qa_outputs(sequence_output)
+        start_logits, end_logits = logits.split(1, dim=-1)
+        start_logits = start_logits.squeeze(-1).contiguous()
+        end_logits = end_logits.squeeze(-1).contiguous()
+        total_loss = None
+        if start_positions is not None and end_positions is not None:
+            if len(start_positions.size()) > 1:
+                start_positions = start_positions.squeeze(-1)
+            if len(end_positions.size()) > 1:
+                end_positions = end_positions.squeeze(-1)
+            ignored_index = start_logits.size(1)
+            start_positions = start_positions.clamp(0, ignored_index)
+            end_positions = end_positions.clamp(0, ignored_index)
+            loss_fct = nn.CrossEntropyLoss(ignore_index=ignored_index)
+            start_loss = loss_fct(start_logits, start_positions)
+            end_loss = loss_fct(end_logits, end_positions)
+            total_loss = (start_loss + end_loss) / 2
+        if not return_dict:
+            output = (start_logits, end_logits) + outputs[2:]
+            return ((total_loss,) + output) if total_loss is not None else output
+        return QuestionAnsweringModelOutput(
+            loss=total_loss,
+            start_logits=start_logits,
+            end_logits=end_logits,
+            hidden_states=outputs.hidden_states,
+            attentions=outputs.attentions,
+        )
+def create_position_ids_from_input_ids(input_ids, padding_idx, past_key_values_length=0):
+    """
+    Replace non-padding symbols with their position numbers. Position numbers begin at padding_idx+1. Padding symbols
+    are ignored. This is modified from fairseq's `utils.make_positions`.
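+    Args:
+        input_ids: torch.Tensor of token ids with shape (batch_size, sequence_length).
+        padding_idx: id of the padding token; padded positions keep this value.
+        past_key_values_length: offset added to every non-padding position.
+    Returns: torch.Tensor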
+    """
+    # The series of casts and type-conversions here are carefully balanced to both work with ONNX export and XLA.
+    mask = input_ids.ne(padding_idx).int()
+    incremental_indices = (torch.cumsum(mask, dim=1).type_as(mask) + past_key_values_length) * mask
+    return incremental_indices.long() + padding_idx
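
The mask-and-cumsum arithmetic in `create_position_ids_from_input_ids` is easier to follow on a small example. The sketch below mirrors the same steps in plain Python lists rather than torch tensors. The function name `position_ids_sketch` and the word ids 5 and 6 are made up for illustration; the special-token ids follow `tokenizer_config.json` in this commit.

```python
from itertools import accumulate
from typing import List


def position_ids_sketch(
    input_ids: List[List[int]], padding_idx: int, past_len: int = 0
) -> List[List[int]]:
    """Pure-Python mirror of the mask/cumsum arithmetic (hypothetical helper, no torch)."""
    result = []
    for row in input_ids:
        # 1 for real tokens, 0 for padding
        mask = [1 if tok != padding_idx else 0 for tok in row]
        # running count of real tokens seen so far
        running = list(accumulate(mask))
        # padded slots collapse to padding_idx; real tokens start at padding_idx + 1
        result.append([(c + past_len) * m + padding_idx for c, m in zip(running, mask)])
    return result


# Assumed ids: <s>=0, <pad>=1, </s>=2; 5 and 6 are made-up word ids.
example = position_ids_sketch([[0, 5, 6, 2, 1, 1]], padding_idx=1)
print(example)  # [[2, 3, 4, 5, 1, 1]]
```

Real tokens are numbered from `padding_idx + 1`, and every padded slot keeps `padding_idx`, so the position embedding for padding is never mixed with real positions.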

modules.json ADDED

	@@ -0,0 +1,20 @@

+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": "",
+    "type": "sentence_transformers.models.Transformer"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Pooling",
+    "type": "sentence_transformers.models.Pooling"
+  },
+  {
+    "idx": 2,
+    "name": "2",
+    "path": "2_Normalize",
+    "type": "sentence_transformers.models.Normalize"
+  }
+]
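
`modules.json` chains three stages: a Transformer encoder, a Pooling step, and a Normalize step. Since `1_Pooling/config.json` in this commit enables `pooling_mode_cls_token`, the last two stages amount to taking the first token's vector and scaling it to unit length. The following is a minimal sketch of that idea in plain Python, not the library's implementation; the helper names and the toy vectors are assumptions.

```python
import math
from typing import List


def cls_pool(token_embeddings: List[List[float]]) -> List[float]:
    """Keep only the first token's vector (the <s>/CLS position)."""
    return list(token_embeddings[0])


def l2_normalize(vec: List[float], eps: float = 1e-12) -> List[float]:
    """Scale the vector to unit Euclidean length; eps guards against a zero vector."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def sentence_embedding(token_embeddings: List[List[float]]) -> List[float]:
    """Stages 1 and 2 of modules.json applied to already-encoded tokens."""
    return l2_normalize(cls_pool(token_embeddings))


# Toy 2-dimensional token embeddings (made up); the real model uses 768 dimensions.
toy = [[3.0, 4.0], [10.0, 0.0], [0.0, 7.0]]
emb = sentence_embedding(toy)
print(emb)  # [0.6, 0.8]
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.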

sentence_bert_config.json ADDED

	@@ -0,0 +1,4 @@

+{
+    "max_seq_length": 8192,
+    "do_lower_case": false
+}

special_tokens_map.json ADDED

	@@ -0,0 +1,51 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "cls_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "<mask>",
+    "lstrip": true,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<pad>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
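
In this map `<s>` serves as both `bos_token` and `cls_token`, and `</s>` as both `eos_token` and `sep_token`. XLM-RoBERTa-style tokenizers conventionally wrap a single sequence as `<s> A </s>` and a pair as `<s> A </s></s> B </s>`. The sketch below illustrates that layout on token strings. It assumes the post-processor stored in `tokenizer.json` follows this convention, and the helper names and token pieces are hypothetical.

```python
from typing import List

CLS = "<s>"    # bos_token / cls_token in special_tokens_map.json
SEP = "</s>"   # eos_token / sep_token in special_tokens_map.json


def wrap_single(tokens: List[str]) -> List[str]:
    """Assumed single-sequence layout: <s> A </s>."""
    return [CLS, *tokens, SEP]


def wrap_pair(tokens_a: List[str], tokens_b: List[str]) -> List[str]:
    """Assumed pair layout: <s> A </s></s> B </s>."""
    return [CLS, *tokens_a, SEP, SEP, *tokens_b, SEP]


# Made-up token pieces, for illustration only.
print(wrap_single(["xin", "chao"]))
print(wrap_pair(["hoi"], ["dap"]))
```

With CLS pooling, the vector at the leading `<s>` position is the one that becomes the sentence embedding.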

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:aa7a6ad87a7ce8fe196787355f6af7d03aee94d19c54a5eb1392ed18c8ef451a
+size 17082988

tokenizer_config.json ADDED

	@@ -0,0 +1,62 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<pad>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "<unk>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "250001": {
+      "content": "<mask>",
+      "lstrip": true,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<s>",
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "<s>",
+  "eos_token": "</s>",
+  "extra_special_tokens": {},
+  "mask_token": "<mask>",
+  "max_length": 8192,
+  "model_max_length": 8192,
+  "pad_to_multiple_of": null,
+  "pad_token": "<pad>",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
+  "sep_token": "</s>",
+  "stride": 0,
+  "tokenizer_class": "XLMRobertaTokenizerFast",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
+  "unk_token": "<unk>"
+}
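
The settings `model_max_length: 8192`, `truncation_side: "right"`, `padding_side: "right"`, and the `<pad>` id of 1 together determine how a batch is shaped before it reaches the model. The sketch below shows the effect on raw id lists. It is a simplification under stated assumptions: the helper `pad_batch` is hypothetical, and unlike a real tokenizer it does not re-append `</s>` after truncating.

```python
from typing import Dict, List

PAD_ID = 1       # "<pad>" in added_tokens_decoder
MAX_LEN = 8192   # model_max_length


def pad_batch(
    batch: List[List[int]], max_len: int = MAX_LEN, pad_id: int = PAD_ID
) -> Dict[str, List[List[int]]]:
    """Right-truncate to max_len, then right-pad every row to the longest row."""
    truncated = [ids[:max_len] for ids in batch]       # truncation_side = "right"
    width = max(len(ids) for ids in truncated)
    input_ids = [ids + [pad_id] * (width - len(ids)) for ids in truncated]  # padding_side = "right"
    attention_mask = [[1] * len(ids) + [0] * (width - len(ids)) for ids in truncated]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Assumed ids: <s>=0, </s>=2; 5, 6 and 7 are made-up word ids.
out = pad_batch([[0, 5, 6, 2], [0, 7, 2]])
print(out["input_ids"])       # [[0, 5, 6, 2], [0, 7, 2, 1]]
print(out["attention_mask"])  # [[1, 1, 1, 1], [1, 1, 1, 0]]
```

The attention mask marks the padded slot with 0, which is the same information `create_position_ids_from_input_ids` recovers by comparing ids against `padding_idx`.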